Alternation:

    P/F-s -> 'to alternate with', 'to take turns with'

E.g. The girls take turns with the boys at the swimming pool. Red flags alternate with blue flags in the row of flagpoles.

[Do not confuse 'alternative' with 'alternation'. An alternative is a substitute while an alternate precedes or follows in temporal or locative sequence. For example, a temporal alternative relationship implies a substitute at the same point in time thus negating the first argument, while the alternation relationship implies a sequential substitution.]

Precedence/Sequence:

    P/F-s -> 'to precede', 'to come before/in front of', 'to be earlier in sequence than'
    Inverse F/P-s -> 'to follow', 'to come after/behind', 'to be later in sequence than'

E.g. January precedes February. February follows January. The wrist comes before the hand, the lower arm comes before the wrist, and the elbow comes before the lower arm.

Contingency:

    P/F-s -> 'to be contingent/conditional on', 'to hinge on', 'to depend on'

E.g. The success of the project depends on complete cooperation.

    Inverse F/P-s -> 'to entail/imply'

'He shouted again' entails 'He shouted earlier'. Lightning implies thunder.

[Important: do not confuse 'contingency/implication' with 'causation'.]

Meaning:

    P/F-s -> 'to mean', 'to signify', 'to stand for', 'to denote'

E.g. The French word "maison" means 'house'. His behavior signifies that he is very angry.

Other kinds of general relationships which we've already discussed are the P/F and AP/F generic verbs. Note that NONE of the general relationships represent mensuration or evaluation concepts. Note also that all of the technical labels that we introduced above, such as "paronym", "superordinate", and "meronym", can be easily derived from the active and inverse forms of the corresponding verbs.


22.1 THE VERB "TO HAVE"

The generic P/F-s state verb "masi" indicates that a relationship exists, but implies nothing about the nature of the relationship.
Thus, its meaning includes ALL of the above P/F-s examples. And, as we discussed earlier, the open adjective form "mabie" provides the genitive sense of the English word "of", the Japanese word "no", the Chinese suffix "-de", the Swahili root "-a", and so on. In its verb form, this generic state relationship completely overlaps the sense of the English verb "to have". To illustrate this, consider the following examples:

    John has the book.                       John's book...
    The project has a new manager.           The project's new manager...
    The house has a red roof.                The house's red roof...
    He has a good reputation.                His good reputation...
    We had problems with the new equipment.  Our problems with...
    John has an answer to your question.     John's answer...
    I had supper at 6 o'clock.               My supper...

In other words, the semantics of the genitive and the verb "to have" are identical, and encompass much more semantic space than the prototypical sense of 'possession', 'ownership', or 'control'. Thus, the verb "masi" is in most respects the equivalent of the English verb "to have". [The English verb is different when it is used as an auxiliary, and when it is used with the causative sense of "I HAD Joe sweep the garage". Also, the meaning of the word defaults to 'possession/ownership' whenever the actual relationship is not clear from context, and this default appears to be universal among natural languages.]

Earlier in this monograph, when we discussed the genitive CCM "-xa-", we indicated that when "-xa-" is used in root position it would represent the concept of 'having' with a default class of P/F-s. Thus, the words "masi" and "xasi = xamasi" are precisely synonymous. While this may seem a rather useless and redundant application of "-xa-", it CAN have its uses if it undergoes further modification. In other words, when no root is present (as in "masi"), we assume a generic state root (compare with "zesi", where "-ze-" is the generic action root).
And since the root is not present, it cannot be further modified. However, when further modification IS necessary, we can use "-xa-". For example, "masi" = "xasi" both mean 'to have', but to indicate stronger or weaker senses of association, we can NOT modify "masi", but we CAN modify "xasi". Thus, "xagesi" means 'to have a lot to do with', "xapisi" means 'to be completely/totally involved with', "xajusi" means 'to be minimally involved with', etc. (The word "xagesi" can also have the sense of the English word "possess" when it does not imply absolute 'control' or 'ownership'.) We will see additional examples below.

Now, some languages make a distinction between _alienable_ and _inalienable_ possession. Alienable possession implies that the possession is inherently temporary (e.g. "John's money"), while inalienable possession implies that the possession is typically permanent (e.g. "John's arm"). If you wish, you can implement this distinction by creating an inalienable possessive root and a corresponding CCM. Note, though, that the adjective version of basic nouns already implies the most common type of inalienable possession:

    "guasuda" = 'duck'

    1. "guasuxano parade" OR "parade mabie guasuda" = 'duck parade' or 'duck's parade'
    2. "guasuno parade" = 'duck parade'

The first example is vague and implies only that the parade was somehow associated with ducks. Thus, the first example could have several meanings, such as that the parade consisted of ducks, that the parade contained people dressed like ducks, that the parade was done for the benefit of ducks, etc. The second example, however, is clearly inalienable, since it means literally 'the parade which BE duck'. In other words, the second example clearly indicates that the parade CONSISTED of 'duck'.

Finally, I mentioned earlier that the root morphemes used to derive the above general relationships would be very productive in deriving additional words.
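Incidentally, the "xasi"-type derivations above are purely compositional, so the pattern can be sketched mechanically. The following is a minimal sketch, not part of the sample language's specification: the function name and dictionary are invented for illustration, and the glosses of the "-ge-", "-pi-", and "-ju-" morphemes are inferred from the examples quoted above.

```python
# Sketch of the compositional pattern behind "xasi"/"xagesi"/"xapisi"/"xajusi":
# the 'have' root "xa" + an optional intensity morpheme + the verb terminator
# "-si".  Morpheme glosses are inferred from the examples in the text.
INTENSITY = {
    None:      "",    # plain association: 'to have'
    "much":    "ge",  # 'to have a lot to do with'
    "total":   "pi",  # 'to be completely/totally involved with'
    "minimal": "ju",  # 'to be minimally involved with'
}

def derive_have_verb(degree=None):
    """Compose the 'have' root "xa" with an intensity morpheme and "-si"."""
    return "xa" + INTENSITY[degree] + "si"

print(derive_have_verb())           # xasi (= masi, 'to have')
print(derive_have_verb("much"))     # xagesi
print(derive_have_verb("total"))    # xapisi
print(derive_have_verb("minimal"))  # xajusi
```

The same slot-filling logic underlies the longer derivations listed next, which simply insert additional morphemes before the terminator.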
Let's do some additional derivation of the most useful relationship - the generic state relationship represented by the verb meaning 'to have':

    P/F-s       masi -> to have
    P/F-s       namasi -> to not have, to be bereft of
    AP/F-s      fisi -> to keep, to retain
                xapifisi -> to monopolize, to hog
    AP/F-s      nafisi -> to abstain from, to avoid, to refrain from, to forgo
    P/F-d       dosi -> to get, to obtain, to receive, to come into/by
    P/F-d       nadosi -> to lose, to forfeit
    AP/F-d      suasi -> to accept, to take, to obtain
                xagesuasi -> to secure, to reap
    AP/F-d      nasuasi -> to give away, to get rid of
    AP/F-p      namisi -> to offer, to present
    A/P/F-d     kosi -> to give
    A/F-d [+P]  kogasi -> to transfer, to convey, to give (to)
                xagekogasi -> to yield up, to relinquish, to give away
    A/P-d [+F]  nakomiusi -> to dispossess, to expropriate
    A/P-s [+F]  natuemiusi -> to deprive

Note that I have implemented the above with the same argument structures as the corresponding English verbs. For example, "kogasi" is anti-passive because the English verb "to transfer/convey" is inherently anti-passive. Also note that the above glosses are completely compatible with the generic derivations of chapter 2. For example, the verb "fisi" means 'to keep' not only with the above sense, but also with the sense 'to remain steadfast on' or 'to stick with'.

It's important to keep in mind that the above relationships are extremely general. Thus, "suasi" meaning 'to take' has the same wide semantic coverage as the English verb, and can be used in a wide range of contexts. Here are some examples:

    I took her advice.
    I took her hand.
    I took her money.
    I took first place in line.
    I took the elevator to the tenth floor.

For the verb meaning 'to transfer', the primary, oblique patient is the entity to which the focus is being transferred. Thus, a sentence like "John kosi the land gape Bill" would mean 'John transferred the land to Bill'.
[Note that the anti-passive CCM "-ga-" is not needed on the verb because the original patient is being expressed obliquely using the case tag "gape".] For the AP/F-d verb meaning 'to take', the secondary agent-patient can be indicated using the generic 0/AP case tag "piupe", which we discussed earlier. For example, "John suasi the book piupe Tom" would mean 'John took the book FROM Tom'. It would also be possible to use the locative 'from', "menadope", to emphasize that the transfer involved a change of location.


23.0 CONJUNCTIONS

A conjunction links two entities or situations, and always provides additional information about the relationship between the items being linked. Also, some conjunctions can be concatenated to link more than two items. Here are a few examples:

    Louise AND Bill just left.
    Louise OR Bill OR Mike will give the talk.
    Bill will go shopping IF Louise wants him to.
    John just went shopping, BUT he forgot to buy coffee.
    He bought the book EVEN THOUGH it was very expensive.
    He was the only one who was sober, SO he had to drive.
    He finished his homework at 7 PM, AND THEN he went outside to play.
    Bill missed the target; IN OTHER WORDS, he lost the match.

Conjunctions ALWAYS link two expressions of the same syntactic type. For example, if a noun phrase immediately follows a conjunction, the conjunction links it to one or more preceding noun phrases. If a complete sentence immediately follows a conjunction, the conjunction links it to one or more preceding sentences. And so on.

Conjunctions can be grouped into the following general categories:

    Additive: and, also, in addition, besides, furthermore, moreover, similarly, likewise, in the same way, in other words, in conclusion, in summary, etc.

    Causal: if, then, unless, even if, so, consequently, thus, it follows, because, under the circumstances, for this reason, therefore, etc.
    Concessive/Adversative: but, and even, in spite of, however, although, albeit, notwithstanding, anyway, nevertheless, even though, regardless, even so, despite, just the same, even now, for all that, still, all the same, yet, whether or not, whatever, no matter what, in fact, as a matter of fact, despite that, on the other hand, etc.

    Substitutive: or, instead of, rather than, in place of, etc.

    Temporal: then, next, after that, on another occasion, an hour later, finally, afterwards, before that, at last, at the same time, subsequently, etc.

    Continuatives/Cohesives: uh, now, well, anyway, okay, at any rate, in any case, etc.

[Incidentally, the above categories reflect LINGUISTIC/DISCOURSE distinctions based on actual usage in natural language, as opposed to LOGICAL distinctions. Logicians categorize conjunctions quite differently, and, in the process, end up excluding words and expressions that are truly conjunctive in nature, or end up restricting their meanings more than natural languages do. For example, most logicians and formal semanticians would not consider expressions such as "in other words", "afterwards", "on the other hand", and "anyway" as actual conjunctions, because they do not perform basic logical operations on truth conditions. In natural language, however, these ARE conjunctions and they perform important conjunctive discourse functions.]

Conjunctions are interesting because of their large numbers and because of the great variety of relationships that they represent. Also, the vast majority of them are derived from basic, open class words. Thus, while conjunctions DO perform a function that is quite different from verbs, nouns, adjectives, etc., their meanings include the concepts of many of these words.


23.1 IMPLEMENTING CONJUNCTIONS

Conjunctions fall into two general categories depending on how they are used in discourse:

1. True conjunctions.
These always link a constituent which follows the conjunction with the closest preceding constituent of the same type. The linkage is thus syntactically precise. Examples: and, or, unless, if, etc.

2. Disjuncts. These only loosely link a constituent which follows the conjunction with one or more of the preceding constituents of the same type. The syntactic linkage is often vague. Examples: but, on the other hand, also, in other words, despite that, etc.

To illustrate the difference between true conjunctions and disjuncts, consider the following:

    The project was over-budget and under-staffed. The project manager was a political hack, and his choice for a tech lead was a bureaucrat who could barely spell his name. Three of the engineers and four of the secretaries were sick most of the time. To make matters worse, the technicians had to spend most of their time on another project that had higher priority and more adequate funding. But the project was a great success.

Notice how "and" precisely links its arguments, creating new constituents of the same syntactic type. The syntax of the linkage is not in doubt. But there IS doubt about the linkage of the word "but". Does it link to the immediately preceding sentence, to the preceding two sentences, or to the entire preceding paragraph? If "but" were a true conjunction, there would be no doubt about which items were being linked. In effect, the semantics of "but" in the above example is not compatible with the syntax of a true conjunction, since the linkage is not clear. The actual linkage can only be determined through context.

Now, since true conjunctions and disjuncts are syntactically distinct, we must treat them as distinct syntactic entities; i.e. we must give them different parts-of-speech in the sample language. And since true conjunctions are relatively rare, the obvious solution is to create unique particles (terminator "-ka") for each one. But how do we handle disjuncts?
Fortunately, there is a way of achieving a disjunctive effect within the existing framework. Earlier, we discussed how a speaker could express his feelings or attitudes about an event by using a verb that takes an entire sentence as its single argument. Here's an example:

    P/F-s     I hope that I'll win.
    F-s [-P]  Hopefully I'll win.

where "hopefully" is actually a verb that takes a complete embedded sentence as an argument - it is NOT an adverb as in English. As I stated earlier, words and expressions like these are called _disjuncts_, and many other examples can be derived in the same way: "to presume" -> "presumably", "to be interesting" -> "interestingly", "to be possible" -> "possibly", "to be incidental" -> "incidentally, by the way", "to be necessary" -> "necessarily", "to be fortunate" -> "fortunately", etc.

The same approach can be used to create disjunctive conjunctions whose scope is not precise. For these, however, we must demote the SECOND argument of the verb rather than the first argument, using an anti-middle construction (CCM = "-xi-"). Here is an example in the sample language:

    P/F-s: The new project is similar to the previous one.
        where "losi = lomasi" = 'to be similar to'
    P-s [-F]: "Loxisi" = 'Similarly, ...', 'Likewise...', 'In like manner', etc.

Alternatively, we can use the terminator "-fo", which we used earlier for sentential register, tense/aspect words, and modal words. When used with other roots, it will apply the corresponding state to the entire sentence or clause. Thus, the above example could also be:

    "Lofo" = 'Similarly, ...', 'Likewise...', 'In like manner', etc.

Note that this usage of "-fo" is perfectly consistent with its use with register, tense/aspect, and modal morphemes. Again, the implied argument is inherently deictic, since it can always be determined from the speech situation. Here are some more English examples:

    P/F-s: The bazaar was in addition to the car wash.
    P-s [-F]: Additionally, ...
    P/F-s: The land swap was an alternative to continued violence.
    P-s [-F]: Alternatively, ...

    P/F-s: The accident occurred after the party.
    P-s [-F]: Afterwards, ...

    P/F-s: His odd behavior meant that he was angry.
    P-s [-F]: In other words, ...

    P/F-s: Red flags alternated with white ones.
    P-s [-F]: On the other hand, ...

As with other disjuncts, the P-s [-F] word is a VERB that has the part-of-speech terminator "-si" or a clause modifier that has the part-of-speech terminator "-fo". It is NOT a true conjunction ending with "-ka"! The disjunct takes the entire sentence that follows as its single argument.

In sum, a true conjunction should be used only when its linkage is clear. This will almost always be the case when the items being linked are part of the same sentence. In general, a disjunct should be used to introduce a sentence that is only loosely linked to the preceding ones.

Now, most of the basic relationships we discussed in the previous section will provide the roots for the most common conjunctions. For example, the conjunction meaning 'and' will be formed from the root for the 'supplementation' relationship plus the terminator "-ka". Similarly, the conjunction meaning 'or/instead' will be derived from the 'alternative' relationship. The conjunctions meaning 'if/then' will be derived from the 'contingency' relationship. And so on. The same roots can also be used to derive disjunctive verbs and adverbs. For example, the root used to derive the true conjunction meaning 'and' can also be used to derive the disjunct meaning 'additionally' and the simple adverb meaning 'too/also'. The disjunct meaning 'but/still' can be derived from the 'incompatibility' relationship. And so on.

[Incidentally, you may be tempted to derive the conjunction meaning 'but/still' from the modality meaning 'even', since the modal implies that what happens is unexpected or incompatible with the norm.
However, this would not be correct because the two concepts are really distinct, although one often implies the other. The basic relationship actually COMPARES the patient with the focus, while the modal describes the patient's ATTITUDE about the focus. However, the derivation from the modal remains useful, and would mean something like '(and/but) unexpectedly' or '(and/but) surprisingly'.]

Some of the more complex temporal combinations (e.g. "three days later", "on a different occasion", etc.) can be implemented using a combination of numeric and temporal roots/MCMs. However, expressions like these are almost never true conjunctions or disjuncts - they are usually topicalized components of the sentence which follows. I will discuss how to deal with this later, in the section on _topicalization_.

Finally, keep in mind that these relationships have verb forms which may actually be easier to use than their true conjunctive or disjunctive forms. For example, the P/F-s verb form of the contingency relationship (or its inverse) can be used in all 'if/then' situations. Here's an example:

    (the robber will go to prison) is contingent on (he is guilty)
        = The robber will go to prison if he is guilty.
or
    (the robber is guilty) entails (he will go to prison)
        = If the robber is guilty, then he will go to prison.

The actual form of the embedded sentences will, of course, depend on the syntax of the AL.


23.2 NOTES ON USING CONJUNCTIONS

Languages often use different conjunctions in different environments, even though they represent the same semantics. For example, English favors "and" for linking noun phrases and clauses, while preferring "also" when linking sentences. Also, English never uses "but" to link noun phrases. Instead, it will split the sentence into two clauses and link them with "but...too/also":

    *John but Bill helped the children.
    John helped the children, but Bill helped them too.
In my opinion, there is no need to duplicate the selectional preferences of a particular natural language. Thus, I WOULD allow the following:

    John but Bill saw Louise. = John saw Louise, but Bill saw her also.
    John if Bill will go shopping. = John will go shopping if Bill will go too.
    John even though Bill went shopping. = John went shopping even though Bill also went.

And so on. Note, though, that the above approach can only be used with true conjunctions (i.e., those terminated by "-ka"). It cannot be used with disjuncts.


23.3 REGISTER VARIATIONS FOR DISJUNCTS

There are many different disjuncts that have essentially the same meanings, but which are used in different settings. Natural languages differ widely in the number and nature of these expressions. Fortunately, we can capture these distinctions without having to arbitrarily create words that will have few close counterparts in other languages. We can do this by simply changing the speech register of the more basic disjuncts, using the register MCMs we discussed earlier. Here are a few examples:

    from 'and/also/too':
        informal    -> 'besides'
        formal      -> 'in addition'
        very formal -> 'furthermore', 'moreover'

    from 'but/still':
        informal    -> 'whatever', 'even so', 'for all that'
        formal      -> 'though', 'although', 'however'
        very formal -> 'nevertheless', 'regardless', 'notwithstanding'

    from 'even though':
        formal      -> 'despite that'
        very formal -> 'in spite of the fact that'

    from 'well/so':
        informal    -> 'okay', 'so anyway', 'so anyhow', 'anyway', 'anyhow', 'okay then'
        formal      -> 'now', 'in any case'
        very formal -> 'at any rate', 'in any event'

    from 'then (= thus)':
        informal    -> 'because of this', 'for this reason'
        formal      -> 'thus', 'therefore'
        very formal -> 'it follows therefore that', 'consequently', 'hence'

And so on. The actual distinctions between informal, formal, etc. will vary somewhat from one person to another, and the above examples reflect my own (subjective) conclusions.
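Since each variant is just a basic disjunct plus a register MCM, a glossing tool could treat the whole system as a lookup table. The sketch below merely transcribes some of the example pairings above into such a table; the data structure and its name are illustrative, not part of the sample language.

```python
# A lookup table pairing (basic disjunct gloss, register) with the English
# renderings listed in the text.  Only two of the five example groups are
# transcribed here; the rest follow the same pattern.
REGISTER_VARIANTS = {
    ("and/also/too", "informal"):    ["besides"],
    ("and/also/too", "formal"):      ["in addition"],
    ("and/also/too", "very formal"): ["furthermore", "moreover"],
    ("but/still", "informal"):       ["whatever", "even so", "for all that"],
    ("but/still", "formal"):         ["though", "although", "however"],
    ("but/still", "very formal"):    ["nevertheless", "regardless",
                                      "notwithstanding"],
}

print(REGISTER_VARIANTS[("but/still", "very formal")])
```

The point of the table form is that the AL itself needs only the basic disjunct plus a register morpheme; the many English near-synonyms fall out as renderings, not as separate vocabulary items.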
(Actually, I doubt if it's possible to PRECISELY define the semantics of these register differences.)


23.4 COORDINATION AMBIGUITY

Conjunctions can be used to solve problems that sometimes show up if the syntax of your AL is strict and unambiguous. For example, if your syntax requires a relative clause to always attach to the closest preceding noun, you would not be able to render the following as a single sentence:

    I told him about the chicken that we had for supper that was killed by a coyote.

If the syntax of your AL is strict (as it is in the sample language), then the relative clause "that was killed by a coyote" would modify the noun "supper", which is nonsense. With a conjunction, however, the problem disappears:

    I told him about the chicken that we had for supper AND that was killed by a coyote.

Here, "AND" links the two "that" clauses so that both modify "chicken".

If a relative clause modifies a noun phrase that is part of a coordinated pair, the linkage may be ambiguous. Consider the following:

    1. The boy and (the girl who ran away)...
    2. (The boy and the girl) who ran away...

Personally, I would define the syntax of my AL so that relative clauses would modify only the single closest noun phrase by default, and so that conjunctions would link the following item with the closest preceding item of the same type. Thus, without further information, number 1 would be the only possible interpretation. If you want the relative clause to apply to the compound phrase, you could modify the relative conjunction with a modifier meaning 'both' or 'all', or something similar. However, this is not a very good solution, since parsing success will now depend on the MEANING of the words in addition to the syntactic relationship between them. If you want your AL to be computer-tractable, parsing must depend ONLY on syntax. Fortunately, there is a solution that is natural and easy to implement.
First, though, consider some English examples:

    The boy and the girl, both of whom ran away, ...
    Jim, Bob, and Joe, all three of whom were in the accident, ...

In effect, the expressions "both of" and "all three of" terminate the coordinated structure and allow further modification. Thus, a simple, comprehensive, and versatile solution would be to create a single terminating particle that can undergo further modification. In the sample language, I will use the particle "saksika" for this purpose, with the general sense 'all (of whom)'. Here are some examples:

    The boy and the girl with the red balloons...
        -> Only the girl has the red balloons.
    The boy and the girl saksika with the red balloons...
        = The boy and the girl, both of whom have red balloons...

It's important to keep in mind that ambiguities, such as in the first English example, are often language-dependent. In other words, if you translate a sentence from language X to language Y, it may be ambiguous in one language but not in the other. However, as a language designer, you have a choice - you can eliminate ambiguity by simply not allowing it to exist. And you can do that by providing only one possible interpretation for a particular structure. Thus, the first example above is ambiguous in English but NOT in the sample language. In the sample language, only the girl has the balloons. In order to indicate that both the boy and the girl have balloons, the particle "saksika" MUST be used.

Now, consider the following two sentences, and note how the parentheses indicate how the constituents are grouped based on their most likely interpretations:

    (The boy with the red hat) and (the girl with the puppy)...
    The boy with ((the lunchpail) and (the book with the missing cover))...

Syntactically, the two examples are identical, but a human listener would group the constituents differently.
In the sample language, the adjectival phrase "with a missing cover" modifies the noun "book", and the conjunction "and" links the noun phrases "the lunchpail" and "the book with a missing cover". Thus, the grouping shown in the second example is correct, while the grouping shown in the first example is wrong. The reason why the first example is not ambiguous in English is that it's the only grouping that makes sense. However, it is possible for the same structure to be ambiguous, as in the following example:

    I just looked at the room with the new computer and the modem with the bad ICs.

Is the modem in the same room as the computer? In the sample language, the answer is "yes", but in English the sentence is ambiguous. Does the computer also have bad ICs? In the sample language, only the modem has bad ICs, but in English it is not clear. In English, the sentence is doubly ambiguous, not only because attachment on the right is ambiguous, but also because we're not sure where the coordinated structure begins. Does it begin with "the room" or does it begin with "the new computer"?

Now, the sample language is not ambiguous. Without "saksika", only the modem has bad ICs. With "saksika", both the computer and the modem have bad ICs. Also, in the sample language, there is no doubt that both the computer and modem are in the same room. How, though, can we indicate that they are NOT in the same room? Again, let's look at how English can do it:

    I just looked at both the room with the new computer and the modem with the bad ICs.

Thus, we can achieve the desired effect by beginning the coordinated structure with the word "both". We can provide ourselves with the same capability in the sample language by creating a single particle that, when used, will mark the beginning of a coordinated structure. When it is not used, attachment will always be to the closest preceding constituent of the same type. In the sample language, we will use the particle "ceka" for this purpose.
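The right-edge attachment rule is mechanical enough to demonstrate with a toy resolver. The sketch below models only the "saksika" half of the system (a following modifier attaches to the closest preceding conjunct by default, or to all conjuncts when "saksika" is present); the list-of-lists representation and function name are invented for illustration, and "ceka", which marks the left edge of the coordination, is not modeled.

```python
# Toy model of the default attachment rule plus "saksika".
# Each conjunct is returned paired with the modifiers that apply to it.
def attach_modifier(conjuncts, modifier, saksika=False):
    """Attach a trailing modifier to a coordinated structure.

    Default: the modifier attaches only to the closest preceding conjunct.
    With "saksika" (the coordination terminator), it attaches to ALL of them.
    """
    result = [[np] for np in conjuncts]
    if saksika:
        for entry in result:         # "saksika": widen scope to every conjunct
            entry.append(modifier)
    else:
        result[-1].append(modifier)  # default: closest preceding conjunct only
    return result

# "the boy and the girl with the red balloons" -> only the girl has them
print(attach_modifier(["the boy", "the girl"], "with the red balloons"))
# "the boy and the girl saksika with the red balloons" -> both have them
print(attach_modifier(["the boy", "the girl"], "with the red balloons",
                      saksika=True))
```

Note that the resolver consults only the presence or absence of the particle, never the meaning of the words, which is exactly the computer-tractability property the text argues for.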
It's important to note that the coordination initiator "ceka" and the coordination terminator "saksika" will only be necessary when the default interpretation is not the desired one. And since most coordinated structures are relatively simple, these particles will probably not be needed very often.


23.5 PARENTHETICAL EXPRESSIONS AND QUOTING

Parenthetical expressions which elaborate or exemplify a concept sometimes use conjunctions, but not always. Here are some examples in English:

    Some people, SUCH AS JOHN, BOB, AND MIKE, had to leave early.
    Many birds do not fly south for the winter (E.G. SPARROWS AND PIGEONS).
    The house needed certain repairs, such as TO THE ROOF AND TO THE CHIMNEY.
    The man who managed the finance department, BILL JOHNSON, also managed the marketing department.
    The single disadvantage (I.E. THE HIGHER COST) will probably kill the project.
    John Smith, WHO JUST FILED FOR BANKRUPTCY, recently moved to Texas.

The actual form of such constructions will depend heavily on the syntax of the AL. I would suggest, though, that normal conjunctions be avoided. Instead, I would create pairs of start/stop particles (using macro forms). In the sample language, we will use the following:

    suka -> start particle for a parenthetical expression
    toka -> stop particle for a parenthetical expression

The start particle will introduce a list of one or more items and the stop particle will terminate the list. [These particles would correspond to pauses used in speech, or commas and parentheses used in writing.] If a list has more than one item, then all the items must be constituents of the same type. For example, a list could contain several noun phrases OR several prepositional phrases, but a single list could not contain BOTH noun and prepositional phrases.

Note that these particles can often be used in the same way that English uses quotes in writing or the words "quote" and "unquote" in speech.
Here are some examples:

    The novel "Stranger in a Strange Land" was very good.
    I heard him speak the words "Rubba Dub Dub" at least three times.

However, as currently defined, "suka" and "toka" do not delimit a complete, standalone argument - they simply exemplify an existing argument. Thus, in the above examples, "novel" and "words" are the actual arguments, and the items delimited by "suka" and "toka" only exemplify the actual arguments. In effect, the quoted material modifies the argument. There will be times, though, when we will need to create a standalone argument consisting only of quoted material. This material could even be in a different language. Here are some examples:

    "Stranger in a Strange Land" was very good.
    I heard him say "Rubba Dub Dub" at least three times.

Note the difference from the earlier examples. In the earlier examples, the particles delimited something which modified an actual argument. But in the last two examples, the particles DEFINE an actual argument. Obviously, if the syntax of your language must be unambiguous and computer-tractable, you cannot use the same mechanism for both types of quoting. There are two possible solutions:

    1. create a second set of particles, or
    2. let "suka" and "toka" delimit a standalone argument and, when necessary, make IT the argument of what it is modifying or exemplifying.

Creating a second set of particles is really overkill, so we will adopt the second approach. With this approach, we would make distinctions as follows:

    I heard him speak the ON-words suka Rubba Dub Dub toka at least three times.
    I heard him say suka Rubba Dub Dub toka at least three times.

In the first sentence, "ON-words" is the open noun form of "words" (terminator = "-giu"), and the expression delimited by "suka" and "toka" is an argument of the open noun. In the second sentence, the expression delimited by "suka" and "toka" is a standalone argument of the verb "say".
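The bracketing behavior of "suka"..."toka" can be shown with a tiny sketch. The token-list representation and function name below are invented for illustration; the one constraint the function enforces (the stop particle may not occur inside the quoted material) is the requirement stated in the text.

```python
# Toy model of "suka"..."toka" quoting: arbitrary material, even material
# from another language, becomes a delimited argument.  The only constraint
# is that the stop particle "toka" must not occur inside the quoted tokens.
def quote(material):
    """Wrap a list of quoted tokens in the suka/toka particle pair."""
    if "toka" in material:
        raise ValueError('"toka" may not appear inside quoted material')
    return ["suka"] + list(material) + ["toka"]

print(quote(["Rubba", "Dub", "Dub"]))
# ['suka', 'Rubba', 'Dub', 'Dub', 'toka']
```

Because the delimiters are ordinary particles rather than punctuation, the same mechanism serves both speech and writing, which is the point of the "quote"/"unquote" comparison above.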
Finally, there is no reason why quoted material must itself be parseable. In fact, it could even be in a different language. The only requirement, of course, is that the terminating particle "toka" must not appear inside the quoted material. In the extremely rare case where "toka" itself needs to be quoted, some form of periphrasis should be used instead.


23.6 INCOMPLETE COORDINATION

Sometimes we need to coordinate more than one item, but do not want to list all items that are applicable. In English, we use expressions such as "etcetera" and "and so on" to do this. In the sample language, we will use the particle "mika" for this purpose. In effect, it terminates a list of coordinated items, and indicates that the list is incomplete. It may also be used to terminate a list of items in a parenthetical expression in place of the stop particle, if the list is incomplete.


23.7 ANAPHORA OF COORDINATED STRUCTURES

There will be times when an anaphor of a coordinated structure will be needed. Here are a few examples:

    a. The engineer and his assistant just left. THEY had to go to work.
    b. The windows broke and a wall fell in. IT was a terrible experience.

In the sample language, a simple anaphor of a coordinated structure will be formed from the first morpheme of the first conjunction, plus the appropriate terminator. A compound anaphor will be formed from the first morpheme of the conjunction, plus the first morpheme of one of the coordinated items (preferably the first one), plus the appropriate terminator. For example, if the word for 'and' is "neka", then the simple anaphor meaning "THEY" in (a) will be "neha". If the word for "engineer" is "veyaneyada", then the compound anaphor for "THEY" in (a) will be "neveha".


23.8 CONDITIONAL CLAUSES

When one event is conditional upon another, English normally links the events with an "if...then" construction, as in the following example:

    If the law is passed, (then) tax forms will be simpler.
And, as we discussed earlier, conditional clauses can be implemented using conjunctions derived from the contingency/entailment relationship or by directly using the contingency/entailment verbs. There are times, though, when we would like to mark a clause as conditional without being forced to create an "if...then" structure. Consider the following two sentences:

Tax forms WILL be simpler under the new law.
Tax forms WOULD be simpler under the new law.

The first implies that the law is or will be in force. The second implies that the law MAY be in force; i.e., that passage of the law is still hypothetical. So, how do we implement the concept represented by the English word "would"? It is clearly not aspectual or modal (at least not within the framework of this monograph). Actually, in light of our discussion in the previous section, the solution should be fairly obvious. The word with the meaning of the English word "would" is simply the disjunct form of the contingency relationship. For example, if the root for the contingency relationship is "-pau-" (default class = P/F-s), then we can do the following (the actual forms of the items in parentheses will depend on the syntax of the AL):

Verb P/F-s: (Tax forms are simpler) pausi (the new law passes) = Having simpler tax forms is contingent upon/depends on passage of the new law.

Inverse F/P-s: (The new law passes) pauvisi (tax forms are simpler) = Passage of the new law entails/means simpler tax forms.

Conjunction "pauka" = 'then': (The new law passes) pauka (tax forms will be simpler) = If the new law is passed, then tax forms will be simpler. [Note that the word meaning "if" cannot be used here.]

Conjunction "pauvika" = 'if': (Tax forms will be simpler) pauvika (the new law passes) = Tax forms will be simpler if the new law passes.

Disjunct "pauxisi" or "paufo": Pauxisi (tax forms are simpler with the new law) = Tax forms would be simpler with the new law.
Pauxisi (tax forms were simpler with the new law) = Tax forms would have been simpler with the new law.

Since "pauxisi/paufo" simply marks its argument as 'hypothetical', it can also be used to represent English "if = whether" in subordinate clauses and complements, as in the following example:

I don't know paufo John will be coming = I don't know if/whether John will be coming.

Note that "pauxisi/paufo" simply indicates that its argument is conditional on some unspecified condition, and that the argument is hypothetical. Thus, it makes no sense to apply a tense to "pauxisi/paufo" itself. The tense must be supplied by the argument of "pauxisi/paufo". In this respect, "pauxisi/paufo" itself behaves just like a modal or tense/aspect word.

24.0 COMPOUNDING AND INCORPORATION

Compounds are single words or simple expressions that represent unique concepts, but which are formed by combining two or more root morphemes. There are three kinds of compounds:

1. Compounds which represent the sum of their components (i.e., both components are present):

to test-fly = to test AND to fly
also drop-kick, stir-fry

2. Compounds in which one root is the argument (core or oblique) of the other root:

watchmaker = X makes watch (argument = object)
also mousetrap, fly swatter, housecleaning, blood test(er)

Compounds of this type can also be created using verbs that are derived from basic nouns: baby oil (= X 'oils' baby), dish towel (= X 'towels' dish), doghouse (= X 'houses' dog), towel rack (= X 'racks' towel), dancehall (= X 'halls' dance), water skis (= X 'skis' water), snowshoes (= X 'shoes' snow), etc.

rescue team = team rescues Y (argument = subject)
also team rescue, student association, fan club, manmade
[Note that the grammatical voice of the verb meaning 'rescue' determines whether the interpretation is 'rescue team' or 'team rescue'.]
college education = X educates Y in/at college (argument = oblique locative)
also beach party, mountain warfare, barn dance, city life

spring showers = it rains DURING spring
also battle fatigue, evening prayers, marital sex, night flight

to towel dry = X dries Y using towel (argument = oblique instrument)
also steam iron, to water cool, handwriting, windmill

to backpedal = X pedals backwards (argument = oblique method/manner)
also to sidestep, freestanding, to dog-paddle, to bunny-hop

And so on. Many more oblique relationships are possible.

3. Compounds in which BOTH roots are core arguments of an IMPLIED verb:

bedsore = bed CAUSES sore
also disease germ, storm damage, tear gas, birth pain
[Note that the INVERSE sense of the verb "cause" is used for "disease germ" and "tear gas".]

tax laws = laws BEING_FOCUSED_ON taxes
also murder investigation, UFO sighting, food requirements

houseboat = boat BEING_THE_SAME_AS house
also dungheap, girl friend, infantry battalion, snowball
[Note that this group could also be considered as the noun equivalent to verb compounds like "stir-fry" mentioned above, since both components are present.]

penknife = knife BEING_SIMILAR_TO/RESEMBLING pen
also handlebar mustache, birdbrain, mother church, hamhanded

mountain village = village BEING_LOCATED_IN/AT mountains
also pocket watch, doorstep, hill people, farm house, bedpan
also inverses silver mine, goldfish pond, racetrack

olive oil = oil BEING_A_DERIVATIVE_OF olives
also solar energy, buffalo hide, wood pulp, cane sugar
also inverses meat calf, milk cow, sugar cane, pulp wood

toolbox = box HAVING tools
also apple pie, bedroom, art museum, pea pod, salt marsh
also inverses lemon peel, student power, door knob, windowpane

And so on. There may be others that fall into this category. However, if there are, I doubt there are very many of them. Note that many compounds can appear in more than one category.
For example, "tree nursery" can be derived from "X GROWS trees AT nursery" or the inverse of "trees BEING_LOCATED_IN nursery". The compound "towel rack" can be derived from "X places towel ON rack" or the inverse of "towel BEING_LOCATED_ON rack". It is important to keep this in mind, since it's possible that one version may be implemented more efficiently than another, even though they have essentially the same meanings. Also, some are more specific, and thus less useful, than others. [Incidentally, Mandarin Chinese has many compounds in which each component means essentially the SAME THING. However, since most Chinese morphemes have several meanings, using just one would be ambiguous. By using two with the same or close meanings, the result is a word whose meaning is the meaning that the two components have in common. In a properly designed AL, this type of compound is totally unnecessary.] 24.1 IMPLEMENTING COMPOUNDS Some languages implement compounds by simply juxtaposing complete words (e.g. English, Chinese, Indonesian, and Quechua). Unfortunately, this approach is useless if you want the resulting compounds to be semantically precise. (By "precise" I mean 'as precise as the inherent precision of the basic components will allow'.) For example, what is the relationship between "house" and "boat" in the word "houseboat"? What is the relationship between "house" and "maid" in the word "housemaid"? Obviously, the relationships are different. Another way to implement compounds is to use a combination of a head word and a morphologically correct modifier (e.g. English adjective-noun compounds "solar panel", "marital sex", "marine life", "academic transfer", etc.). English uses this approach occasionally, French uses it more often, while Russian and Arabic use it quite often. In general, a language is more likely to use this approach if it has a regular and productive way to convert words from one part-of-speech to another. 
However, while the semantics of this kind of construction is more precise than simple juxtaposition, it can still be ambiguous. In many languages, ambiguity is somewhat reduced by using linking morphemes such as English prepositions. Swahili uses this approach for all of its compounds, and French uses it for most (French examples: "salle à manger", "eau de toilette", "film en couleurs", etc.). English uses it occasionally, as in "son-in-law", "hand-to-hand", and "bed of nails". Note, though, that these linking words can be very vague and their use is often idiosyncratic. If we want the semantics of our compounds to be precise, then the semantics of the linkers must also be precise. With the above comments in mind, let's look again at each type of compound and ask ourselves the following questions: a. Do we already have a way to implement this type of compound? b. If not, what new technique should we create to do it? As I will show below, the answer to question "a" is always "yes", making question "b" unnecessary. Here goes... 1. Verb-verb compounds Compounds similar to English "stir-fry" seem to be quite rare among natural languages. The only languages I know of that use them frequently are Chinese and a few others that make extensive use of serial verb constructions. My recommendation is to implement verb-verb compounds ONLY if the syntax of your AL can unambiguously handle serial verb constructions. If not, then your AL should require the use of an explicit conjunction, as in "to stir AND fry" or "to drop THEN kick". In our sample language we have both options: we can use conjunctions, or we can create case tags and adverbs that perform the same semantic function as serial verbs. 2. Compounds in which one root is the argument of the other root We can often accomplish this in our sample language by 'opening up' the argument structure of nouns and adjectives derived from verbs. 
Here are two examples using words we've already created:

duck student = "teyomigiu guasuda" = 'studier of ducks'
where noun "teyomida" = 'student', and "guasuda" = 'duck'.
[Remember, we open up the argument structure of a normally 'closed' noun by terminating the word with "-giu" instead of "-da".]

duck student = "guasuno teyomida" = 'a student who is a duck'
[Here, there is no need to 'open up' the noun "student" to make the subject position available for use. Instead, we simply use the adjective version of the noun meaning 'duck'.]

For compounds in which one component is an OBLIQUE argument of the other, there is more than one possible approach. For example, a compound like "mountain village" can be implemented as "village mepe mountains", which literally means 'village in the mountains'. (Incidentally, this is the way that most natural languages would form such a compound.) Another way is to use a verb form of one of the components. Here is a complete derivation:

-ke-       state root meaning 'high'
-nai-      noun classifier for natural location
kenaida    'mountain'
-xoya-     state root meaning 'alive/living'
-te-       noun classifier for artificial location
xoyateda   'town'
-so-       "smaller" diminutive CCM
xoyatesoda 'village'

A/P-d: kenaipusi = to 'mountain' something; i.e. to cause something to come together with 'mountain'
P-s: kenaisesi = to be 'mountained'

Thus, "kenaiseno xoyatesoda" = 'mountain village'

All locative noun-noun compounds, including inverses such as "silver mine", can be implemented in this way. We can also create a vaguer compound by using the open noun version of the word for 'village'. The result is "xoyatesogiu kenaida", which can be paraphrased 'village of mountain'. Finally, many compounds are really not necessary. For example, the English word "backpedal" can be just as easily implemented as "to pedal backwards", where "backwards" is a basic adverb.

3.
Compounds in which BOTH roots are core arguments of an IMPLIED verb

Again, our sample language already has this capability. Here are a few examples using words we've already created:

student teacher = "teyomino teyokoda" = 'teacher who is a student'
where noun "teyokoda" = 'teacher', and noun "teyomida" = 'student'.

duck reservoir = "guasuseno guateda"
where P-s verb "guasusesi" = 'to be populated by ducks', and "guateda" = 'reservoir'.

snow duck = "xumpifano guasuda" = 'duck which is snow' (cf. "snowman", "snowball", etc.)
where "xumpifada" = mass noun 'snow', and "guasuda" = 'duck'.

Note that most of the above are just adjective-noun compounds, where the basic relationship is not stated separately, but is the result of normal derivational rules. Our sample language can create many compounds this way, as is commonly done in languages such as French, Russian, and Arabic, but with true semantic precision. Now, let's create some compounds in which the relationship must be indicated by a separate word:

hydrology book = book mabie guatiwada = 'book focused_on hydrology'
where "guatiwada" = 'hydrology', "mabie" = generic P/F-s open adjective.
[Note that we could also have used the open form of the P/F-s derivation of the word for 'book', with "guatiwada" as its argument.]

penknife = knife lobie pen = 'knife being_similar_to pen'
where "losi = lomasi" is the P/F-s verb 'to be similar to'.
[This compound can also be implemented as "knife-like pen", where "-like" is the root/MCM "-lo-".]

silver mine = mine mevibie silver = 'mine being_the_location_of silver'
where "mevibie" is derived from the inverse of the P/F-s verb meaning 'to be located in/at'.
[Incidentally, "silver mine" can be rendered more efficiently using the same approaches that we used above for "mountain village"; i.e. "silvered mine" or "mine of silver". It can also be implemented as "mine containing silver", where "containing" is derived from the basic constituency relationship we discussed earlier.
In fact, the MOST efficient as well as most general derivation would be the noun for 'mine' modified by the simple adjective version of 'silver'. This construction states that the 'mine' IS silver. However, there is no reason to insist that the 'mine' must consist ENTIRELY of silver in order to use this construction.] And so on. These compounds are similar to Swahili compounds and most French compounds, but are semantically precise. English often creates similar constructions, such as "blood-sucking mosquitos", "swamp-dwelling amphibians", "man-eating tigers", "house-cleaning lady", etc. In these, however, only the hyphenated part of the construction is usually classified as a compound. Thus, since ANY relationship can be expressed by a transitive verb, and since ANY transitive verb can be converted to an open adjective, there is no limit on the number of compounds that can be created with semantic precision. Finally, the approach we are using here allows us to create many useful compounds that, in a language like English, would be either ambiguous or even impossible to create. For example, the English compound "woman teacher" could mean 'woman who teaches', 'teacher of woman', 'teacher who focuses on women', 'one who teaches like a woman', etc. With the system proposed here, we can create more compounds, and their meanings are always obvious. This ability is especially important if you are creating an AL that will be used by people who have different native languages. For example, if we were to create compounds as in English (by the simple juxtaposition of two root morphemes) the results will often be interpreted differently by people of different linguistic backgrounds. Also, it's important to note that the methods we are using here were not developed for the purpose of creating compounds. In fact, we did not develop ANY special techniques in this section on compounding. 
All of the techniques that we used to create compounds are the basic derivational processes we've been using all along. This is highly beneficial because it FORCES the word designer to create and use root morphemes systematically, rather than idiosyncratically. For example, if the word designer were to borrow the English compounding system, then he would almost certainly want to borrow many of the English compounds themselves. The net result would be a clone of the English vocabulary, and speakers of other languages would often be puzzled, confused, and even frustrated by the choices made. However, by making full use of a rich derivational morphology, we can completely eliminate idiosyncrasy, and rules intended exclusively for compounding are simply not necessary. It will also make syntactic parsing MUCH simpler, and, if the syntax is designed properly, syntactic ambiguity can also be completely eliminated.

24.2 COMPOUNDS FROM OTHER DERIVATIONAL MORPHEMES

As we've already seen, derivation of basic nouns is actually very similar to compounding. For example, when we derived nouns from the root meaning 'water', we paraphrased them using expressions such as "water bug" and "water energy". In fact, because of the nature of the classificational system we are using here, all basic nouns are pseudo-compounds in which the classifier plays the role of a semantically precise 'component' or 'headword'. To make this approach even more flexible, we can extend it by using both verb AND noun classifiers in the same word. For example, we can combine the verb "teyomisi" meaning 'to study' with the 'time' noun classifier "-be-" to create the word "teyomibeda" meaning 'study period'. Since natural languages contain many more nouns than verbs, we may want to increase the productivity of our morphology by creating even more noun classes. For example, we could add several new animal categories. We could split the plant categories into more precise sub-categories.
We could divide up the various artifact categories based on how the objects are created or used, such as 'vehicles', 'tools', 'works of art', etc. CCMs and MCMs can also be used to create words that, in other languages, are often implemented as compounds. We've already seen many of these. Here are some examples:

xumpifagida -> snowflake (count CCM "-gi-")
xumpijigeda -> snowstorm ("high" scalar MCM "-ge-")
xumpijisoda -> snow flurries ("low" scalar MCM "-so-")
teyomidejazmida -> subject matter (middle CCM "-de-", and mass CCM "-jazmi-")
teyokoveda -> teaching ability (quality/ability CCM "-ve-")
tencipiapada -> intellectual growth (process CCM "-pa-")
mesuabosi -> to get together (reciprocal CCM "-bo-")

In fact, it would be very useful to create several more classifiers that represent concepts that are commonly used in compounding. We could call these morphemes _compounding classifiers_. Here are some concepts that we might want to implement as compounding classifiers, along with the actual morphemes that we will use in the sample language:

room                         -kai-   e.g. bedroom, kitchen, concert hall
building/residence           -moi-   e.g. doghouse, boathouse, museum, temple
shop/business                -xempi- e.g. butcher shop, bakery, bookstore
measurement/detection device -mesko- e.g. thermometer, scale, microphone
tool/implement               -ca-    e.g. screwdriver, hammer, fork, scissors
vehicle                      -gau-   e.g. bicycle, automobile, rickshaw, boat

As an example, when we discussed how to change basic verbs into nouns and adjectives, we briefly mentioned that verbs with "instrumental" subjects could be quite useful.
Now that we have the tool/implement classifier, we can elaborate:

from A/P/F-d "teyokosi" = 'to teach', we can derive:
"teyokocada" = teaching materials
"teyokocano" = pedagogical/educational

from A/P/F-p "teyoniosi" = 'to instruct', we can derive:
"teyoniocano" = instructive

from AP/F-s "teyofisi" = 'to review':
"teyoficada" = review materials

from AP/F-d "teyosuasi" = 'to self-teach':
"teyosuacada" = self-study materials

from AP/F-p "teyomisi" = 'to study':
"teyomicada" = study materials
"teyomicano" = heuristic

Words meaning 'room', 'building', 'place of business', etc. can be created by simply using the classifiers as roots. For example, the word meaning 'room' is "kaida" and the word meaning 'vehicle' is "gauda". In summary, I don't feel that it's necessary to add any new morpho-lexical features to the sample language to handle compounds. Any compound that is needed can be created easily and precisely using the existing derivational techniques.

24.3 MNEMONIC DERIVATIONS

Some compounds are not semantically precise, but actually refer to a subset of entities within a class. In other words, the compound actually describes more entities than it is intended to represent. For example, we might be tempted to create the adjective+noun compound with the literal meaning 'black bear' to represent the species 'Black bear'. However, this would be incorrect, since 'black bear' can apply to any bears that are black in color, even those that are not members of the species 'Black bear'. Because of this, a normal compound cannot be used. What we need is a way to make a distinction between normal, semantically precise compounds and mnemonic compounds. In the sample language, we will accomplish this by changing the noun terminator from "-da" to "-dawe", the adjective terminator from "-no" to "-nowe", the verb terminator from "-si" to "-siwe", and the adverb terminator from "-pe" to "-pewe" for derivations that refer to distinct concepts that are over-described by normal derivation.
The new terminator will be used on the headword of a normally formed compound. For example, if the word for 'bear' is "hayumoda", and the root for "black" is "-xava-" (default class = P-s), then the words "xavano hayumoda" can be applied to any bear that is black in color, while the mnemonic compound "xavano hayumodawe" will refer only to members of the species 'Black bear'. With this approach, we are providing ourselves with the ability to use normal compounding techniques where we feel that a simple basic noun is inappropriate. [Note that mnemonic derivations do not have to be limited to compounds - they can also be used for single words - although I can't think of any examples at the moment. Also, keep in mind that a mnemonic compound such as "xavano hayumodawe" is only likely to be used once. After it is introduced, the compound anaphor "hayuxaha" can be used for the remainder of the discourse.] Later, I will discuss a consistent and objective approach for naming species that uses both normal basic nouns and mnemonic compounds. 25.0 TOPICALIZATION We've already discussed two ways in which an argument of a verb can be 'topicalized' or made more salient than other arguments. In this section, I will discuss and summarize all of the various degrees of topicalization that an AL will need. Topical constructions add emphasis and sometimes contrast over and above the normal topicalization indicated by argument structure. In natural language, there are basically four degrees of topicalization: 1. Normal topicalization. Topicalization is indicated by the basic argument structure of the verb; i.e. a subject is more topical than an object or an oblique argument. In some languages, especially those with an anti-passive construction, objects are more topical than oblique arguments. (English does not seem to make a distinction in topicality between objects and obliques. 
This view is supported by the fact that so many English verbs are inherently anti-passive but do not have active counterparts with clear differences in topicality; e.g. "to listen to", "to talk to", "to look at/for/up", "to wink/shout/laugh at", "to complain", etc.)

2. Contrasting topicalization. Topicalization provides both emphasis and contrast. Here are some English examples:

It's John who killed the chicken.
It's the chicken that John killed.
It's killing that John did to the chicken.
A chicken is what John killed OR What John killed is a chicken.

3. Heavy topicalization. An argument of the verb is made more topical than the subject. Here are some English examples:

Bill, I saw him yesterday.
The new amusement park, it opens for business today.
On Sunday, I plan to relax all day.
With his new suit, he can attend the conference without embarrassment.

4. Reference-switching. A new entity is introduced into the conversation and singled out for special attention. Here are some examples:

As for the chair, John broke it.
As regards John, he left in disgust.
As far as the meeting is concerned, I decided not to attend.
With regard to the delays, I assure you they won't happen again.

Normal topicalization is an inherent part of the verbal derivational system that we are discussing in this monograph. This system is not only perfectly regular, but it allows us to create four sub-degrees of topicality (subject vs. object vs. expressible oblique vs. inexpressible oblique). In contrast, most European languages provide only two or three sub-degrees, while typically displaying a considerable amount of idiosyncrasy. The second kind of topicalization is used to add both emphasis and contrast to an argument of a verb. English is somewhat unusual among the world's languages in implementing this function using cleft sentences. Most languages achieve this function by somehow marking the item with an inflection or particle and leaving the item in its normal position in the sentence.
In our sample language, this emphasis/contrast function is performed in the more typical way; i.e. by using modal particles (or their derivatives), as we discussed earlier. The third type of topicalization, heavy topicalization, focuses the listener's attention on a particular argument of the verb. In effect, it makes the argument even more topical than a normal subject. Most natural languages, including English, accomplish heavy topicalization by a process called _left dislocation_; i.e. by moving the emphasized argument out of the sentence and placing it before the sentence. In addition, an anaphor of the moved item normally appears in the original position in the sentence if the moved item is a core argument of the verb. Thus, in English: The Smiths, THEY left early. Here, "the Smiths" is left-dislocated and the anaphor "they" takes its place in the sentence. In addition to the dislocation, languages mark the emphasized item either by an explicit marker, such as a particle, by a change in stress and timing, or both. Left-dislocation seems to be the way that most natural languages implement heavy topicalization. In fact, I have not been able to find a single example of a language that implements this function differently. Also, in most (if not all) languages, an anaphor of the dislocated item occupies the original position in the sentence if the dislocated item is a core argument. Thus, I suggest that the same approach be used in an AL. In our sample language, I will reserve the particle "pika" for this purpose. Here are some examples: Pika guasuda, the sailors ate guaha. = The duck, the sailors ate it. Pika on Sunday, I plan to relax all day. = On Sunday, I plan to relax all day. And so on. Note that we could have used the deictic "sestuda" (meaning 'it' or 'they') instead of the anaphor "guaha". However, as we discussed earlier, the use of deictics can result in ambiguity if the referent of the deictic is not obvious from context. 
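Since "pika" topicalization is purely positional, it can be sketched as a simple string operation (an illustrative toy; the function name and the placeholder convention are my own assumptions, and the anaphor is taken as given):

```python
def heavy_topicalize(topic, sentence_with_gap, anaphor):
    """Left-dislocation with 'pika': the topicalized argument is fronted,
    and (for core arguments) an anaphor fills its original position."""
    return "Pika " + topic + ", " + sentence_with_gap.format(anaphor=anaphor)

# 'guaha' is the compound anaphor for 'guasuda' ('duck'):
heavy_topicalize("guasuda", "the sailors ate {anaphor}", "guaha")
# -> 'Pika guasuda, the sailors ate guaha'
```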
The fourth kind of topicalization, reference-switching, introduces or reintroduces an entity into the conversation, and singles it out for special attention. This is also normally implemented as a type of left-dislocation, since the argument is moved to the left of the sentence and the gap in the main sentence is almost always filled with an anaphor of the moved argument. In English, this is usually accomplished with phrases such as "as for", "with regard to", "as far as X is concerned", etc. In our sample language, I will use the particle "boka" for this purpose:

Boka John, I think the boss is going to fire him. = As far as John is concerned, I think the boss is going to fire him.
Boka the new employee, I think he'll do very well. = As for the new employee, I think he'll do very well.

Note that both "pika" and "boka" require that a complete sentence immediately follow their argument.

26.0 PROPER NOUNS AND VOCATIVES

Proper nouns are the names of individual people, places, and things. However, what is considered a proper noun can differ from language to language. Here is the precise definition that we will use for the sample language: A proper noun is a word that is used to refer to one or more specific, unique representatives of a more general category designated by a basic noun. Thus, using the above definition, words such as "Atlantic", "Johnson", "IBM", "Christianity", "Caucasian", "1996", and "USA" are all proper nouns. They are unique instances of, respectively, the following common nouns: "ocean", "person", "corporation", "religion", "race", "year", and "nation". There's more than one way to deal with proper nouns, but I feel that the best approach is to reserve several terminators for proper nouns and their verb, adjective, and adverb forms. Roots, classifiers, CCMs, and MCMs will NOT have semantic significance, but they may be used for their mnemonic value.
In addition, consonants and vowels that are not part of the sample language may be used in proper nouns, and the rules limiting the forms of vowel and consonant clusters can be ignored. Here are the terminators we will use in the sample language:

proper noun      -daya
proper adjective -noya
proper verb      -sia
proper adverb    -peya

Here are some examples:

Boston - Bostodaya
New York - Nuyokedaya [Note that we cannot use "Nuyokadaya" because "ka" is a terminator.]
John/Jonathan - Jonadaya
Johnny - Jonacaudaya ("-cau-" is the 'informal' MCM)
Richard - Ricadaya
Louise - Luisadaya
Michael - Mikodaya
Anderson - Nandersodaya [In the sample language, a word cannot begin with a vowel.]
Franklin Delano Roosevelt - Franklinoya Delanoya Rosaveltadaya [Note that we cannot use "Delanonoya" because "no" is a terminator.]
Boeing - Bowingadaya
Democratic (party) - Demokratinoya
France - Fransadaya
Japan - Nipodaya
Detroit - Detrodaya

Obviously, the above approach does not allow further derivation. However, we can make one exception to our normal rules as follows: The non-noun forms of all proper nouns will be genitive. This should not cause any problems since there is rarely a need to make a distinction between alienable and inalienable possession using proper nouns. (Note that this is also the approach we took with anaphora.) Thus, the word "Niponoya" would mean 'Japanese', since it refers to anything that is associated with Japan. Similarly, "Niposia" means 'to be Japanese' and "Bostosia" means 'to be Bostonian'. The adverbial form will be class "0" by default, and will be especially useful with temporal and locative expressions (more below). A proper noun can be modified by adjectives to indicate titles. Here's an example:

teyokoda - 'teacher'
teyokogeda - 'professor' ("-ge-" = augmentative MCM)
Nandersodaya - 'Anderson'
teyokogeno Nandersodaya - 'Professor Anderson'

Note that the above literally means 'Anderson who is a professor' or simply 'Anderson the professor'.
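Because the proper-noun terminators are strictly parallel, deriving the genitive non-noun forms amounts to a terminator swap (a sketch; the function and table names are mine, and only the four terminators given above are assumed):

```python
PROPER_TERMINATORS = {
    "noun": "daya",
    "adjective": "noya",
    "verb": "sia",
    "adverb": "peya",
}

def proper_form(proper_noun, part_of_speech):
    """Swap '-daya' for the requested proper part-of-speech terminator,
    e.g. 'Nipodaya' (Japan) -> 'Niponoya' (Japanese, i.e. 'of Japan')."""
    assert proper_noun.endswith("daya")
    return proper_noun[: -len("daya")] + PROPER_TERMINATORS[part_of_speech]

proper_form("Nipodaya", "verb")     # -> 'Niposia'   ('to be Japanese')
proper_form("Bostodaya", "adverb")  # -> 'Bostopeya' ('in Boston')
```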
In the same way, the adjective form of a modifying noun can be used to create a proper name with a more specific meaning. For example:

kenaida - 'mountain'
Verestedaya - 'Everest'
kenaino Verestedaya - 'Mount Everest'

Note that the above example literally means 'Everest the mountain'. To signify a looser connection between the common and proper nouns, we could use the open noun form of the common noun instead of the adjective form:

kenaigiu Verestedaya - 'Mount Everest'

Literally, the above example means 'the mountain of Everest' or 'the mountain associated with Everest'. However, I am not entirely happy with the second approach. It seems to me that the headword of a proper compound should itself be a proper noun. An expression such as 'the mountain of Everest' is not a true proper compound because it could conceivably apply to more than one mountain. Note that either approach can be used to name items such as "The Eiffel Tower", "The Sea of Japan", "Ockham's Razor", etc. They can also be used for proper compounds in which neither component is a proper noun, such as "The White House", "The Grand Prix", and "The Liberty Bell". For example, "The Liberty Bell" could be implemented as "Liberty the bell", where "bell" is simply the adjective form of the common noun meaning 'bell', and "Liberty" is the noun meaning 'liberty' terminated by "-daya" instead of "-da". Conventions can also be adopted that apply to proper nouns that come in groups. For example, days of the week can all have the form "DeXXXdaya", where the sub-string XXX is a numeric morpheme:

Defedaya - Monday ("-fe-" = numeric 'one')
Dedudaya - Tuesday ("-du-" = numeric 'two')
Dezidaya - Wednesday ("-zi-" = numeric 'three')

And so on. Conventions can also be adopted for months of the year, the years themselves (e.g. "1996"), letters of the alphabet, stellar constellations, etc. The adverbial forms will be most useful; e.g. "Dezipeya" = '(on) Wednesday', "Bostopeya" = 'in Boston', etc.
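The "DeXXXdaya" convention is regular enough to generate mechanically (a minimal sketch using only the three numeric morphemes introduced above; extending the table to seven days would require the remaining numeric morphemes, which have not been given here):

```python
NUMERIC_MORPHEMES = {1: "fe", 2: "du", 3: "zi"}  # 'one', 'two', 'three'

def weekday(n, terminator="daya"):
    """Form 'De' + numeric morpheme + terminator per the DeXXXdaya
    convention; passing terminator='peya' yields the adverbial form."""
    return "De" + NUMERIC_MORPHEMES[n] + terminator

weekday(1)          # -> 'Defedaya' (Monday)
weekday(3, "peya")  # -> 'Dezipeya' ('on Wednesday')
```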
A special part-of-speech terminator can be used for vocatives. In the sample language, I will use "-vu" for common vocatives and "-vuya" for proper vocatives:

Ricavuya, come here! = Richard, come here!
Teyokogevu, may I have a word with you? = Professor, may I have a word with you?
Teyokogeno Nandersovuya, I won't be able to attend the seminar. = Professor Anderson, I won't be able to attend the seminar.
Vu, where are you going? = Say there, where are you going?

Note that a vocative is like a noun because it can be modified by an adjective, an open adjective, or even a relative clause. It functions, however, as a complete, stand-alone sentence.

27.0 CHOOSING PRIMITIVES: VOCABULARY DESIGN STRATEGY

In this section, I would like to discuss a strategy that you can use to design the vocabulary of an AL. This strategy does not consist of specific rules or procedures; instead, it is aimed at providing an overall philosophy or set of guidelines. (I will discuss a more specific methodology in the next section.)

27.1 DESIGNING BASIC VERBS

Very early in this monograph, we decomposed the verb meaning 'to know' into a root concept and an argument structure. We then applied all other possible argument structures to the same root. This process resulted in many unexpected and extremely useful derivations. The number of useful derivations increased even more as we applied CCMs and MCMs. What I find most gratifying about the process is that many of the derivations are truly surprising. For example, while I felt that the concepts underlying the words "forgive" and "apologize" were related, I would not have expected to be able to derive both words from the same root. With the above in mind, I have defined five principles of verb design:

1. Start with simple, common verbs. Isolate their root concepts and apply ALL classifiers. Appropriate CCMs should be used when related verbs have different argument structures (e.g. "to say" vs. "to tell"). In the process, the vast majority of less common verbs will be automatically derived. This principle also applies to numeric, deictic, tense/aspect, and modal concepts.

2. Keep in mind the inherent difference between basic state concepts and modal concepts. Always test new concepts to determine if they are modal.

3. Postpone derivation of actions until all state verbs have been created. Many action verbs are actually "-p" derivations of state concepts.

4. If you have difficulty defining a basic state or modality, or if it has limited usefulness when combined with most classifiers, CCMs, and MCMs, it is very likely that the state is not very basic.

5. Always be suspicious of roots that represent energetic states. Many of these concepts can actually be derived from non-energetic states that end up being much more productive.

The fourth principle is the most difficult to apply, since the nature of the more basic state or modality may not be obvious. In a situation like this, postpone derivation of the particular verb or modal. There's a good chance that the desired word will be derivable from a different root concept that you haven't yet defined. Another tactic is to examine words that have similar meanings (a thesaurus can be very useful for this), or to create a few paraphrases of a sentence that uses the word. For example, how do we deal with the verb "to establish"?

He established his innocence.
He proved his innocence.
He convinced others of his innocence.

"He" = agent
"others" = patient
"his innocence" = focus

Thus, "to establish/prove" is simply the A/F-d [+P] (i.e. anti-passive) derivation of the A/P/F-d verb meaning 'to convince (of/that)'. And, as we saw earlier, the verb "to convince" is actually derived from the probability modality. Thus, this sense of the English word meaning 'to establish/prove' is represented by the word "pintekogasi" in the sample language.
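A derivation like this can be sanity-checked by making the morpheme concatenation explicit. The sketch below is mine, not part of the monograph: the helper name "derive" is illustrative, and I am assuming that "pintekogasi" decomposes as the root "-pinte-" (the probability root) plus "-ko-" (the A/P/F-d verb classifier from Appendix B), "-ga-" (the anti-passive CCM), and the verb terminator "-si"; the intermediate form "pintekosi" for 'to convince' is likewise my assumption.

```python
def derive(root: str, classifier: str = "", ccms=(), terminator: str = "si") -> str:
    """Concatenate root + verb classifier + class-changing morphemes + terminator."""
    return root + classifier + "".join(ccms) + terminator

# A/P/F-d 'to convince (of/that)' -- hypothetical surface form
convince = derive("pinte", "ko")
# its anti-passive (A/F-d [+P]) derivation, 'to establish/prove'
prove = derive("pinte", "ko", ccms=("ga",))
```

The point of the sketch is only that the word attested in the text falls out of straightforward concatenation once the component morphemes are identified.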
Incidentally, by now it shouldn't be too surprising that obscure grammatical voice operations such as anti-passive, inverse, obviative, etc. can produce so many useful words. Languages that do not have these voice operations must instead use unique root morphemes, periphrasis, metaphors, or even idioms. Because of this, it is important to constantly keep these 'obscure' derivations in mind, especially when you run into difficulties. There are many hidden surprises in such a powerful derivational system as the one proposed here. As an illustration of the fifth principle, consider the word meaning 'to search for' or 'to look for'. If we derive this verb from the complex, energetic state meaning 'searching', then almost none of the other derivations will be very useful. In a situation like this, it's often helpful to create a short dialogue that uses the word in a realistic context. The dialogue may contain other words that can be derived from the required root, which, in turn, could provide clues about the nature of the state we're looking for. Here's a sample dialogue:

John LOST his watch. It's been MISSING for three days. He said he'll keep SEARCHING until he FINDS it.

If we can define a state concept for the P/F-d verb "find", we can then derive the AP/F-p action counterpart meaning 'agent attempts to find focus' = 'to search for'. But what is the state concept for the verb "find"? It appears to be the state meaning 'knowledge of a location'; i.e., to 'find' something means to determine its location. Earlier, in the section on counts and measures, we created the CCM "-vie-", which converted a state to a concept meaning 'to measure or determine the state'. This, of course, is exactly what we need here. The verb meaning 'to look for' can be paraphrased as 'to attempt to determine the location of'. And the state root meaning 'location' is "-me-", the same root we used to derive several locative case tags and verbs.
Here are some useful derivations using this root and the CCM "-vie-" (default class = AP/F-d):

AP/F-d: "meviesi" - 'to (seek and) find', 'to locate', 'to determine the location of'
AP/F-p: "meviemisi" - 'to look for', 'to search for', 'to seek'
P/F-d: "meviedosi" - 'to happen upon', 'to discover', 'to stumble across', 'to accidentally find'
P/F-s: "meviemasi" - 'to know the location of'
F-p [+AP] adj: "mevieminuno" - 'sought after'
P/F-d: "menaviedosi" - 'to lose', 'to lose track of'

Thus, it's important to be especially careful with words which seem to imply energetic states, but in which the agent tries to obtain (or successfully obtains) a clearly defined goal or end point. In fact, without such a goal, the act would be useless. True energetic states, such as "to jog", "to sing", "to play", "to swim", "to twinkle", and so on do not incorporate a pre-defined goal into their meaning. Instead, the activity itself is useful, desirable, or natural.

27.2 DESIGNING BASIC NOUNS

In natural languages, basic nouns far outnumber basic verbs. I suspect that this is so because Mother Nature and human ingenuity have provided us with many unique 'things', and humans have created many unique names for them. However, we seem to describe the WAYS these 'things' interact using a much smaller vocabulary. Fortunately, the system proposed here is eminently qualified to deal with this difference in relative numbers, because it inherently allows us to create more basic nouns from a particular root than basic verbs. This is true for two reasons: First, the design of basic nouns is more flexible because the roots are used for their mnemonic value (which can be vague or even metaphoric) rather than for their semantically precise meanings. Second, as AL designers, we are free to create as many noun classes as we feel are necessary, which will allow us to derive even more basic nouns from a particular root.
What this implies to me is that our top priority should always be to derive most (if not all) basic verbs first. Derivation of nouns should be postponed until later. However, since some nouns may be needed for testing, they should be derived tentatively, and only for the most obvious and harmonious combinations of root-plus-classifier. Once you have compiled a comprehensive list of state/action/modal concepts, you can then start matching them up with appropriate real-world entities. 27.3 SINGLE WORDS OR COMPOUNDS? When designing your vocabulary, you will often have to ask yourself whether a concept should be implemented as a single word or as a compound. Natural languages differ considerably in this respect. For example, English has unique unrelated words meaning 'mouse' and 'rat', while Japanese does not. On the other hand, Swahili has unique, unrelated words for 'soldier ant', 'white ant', and 'brown ant', whereas English forms compounds. Obviously, the word designer will be heavily influenced by his native language, and may unintentionally copy it. In order to avoid this inherent kind of bias, I suggest the following guidelines for living noun classes: For the living noun classes, a single word should be created for each biological category (phylum, order, class, family, or genus) that is linguistically useful; i.e., which is likely to have a single-word representation in a natural language. A single word may also be used to represent a super-category consisting of more than one category, if the categories are similar enough, and if a natural language is unlikely to differentiate between them. For sub-categories (such as individual species) within a category or super-category, a descriptive mnemonic compound should be created. For extremely common sub-categories, a unique common noun can be created. 
To illustrate the first guideline, consider the following chart:

Common name          Family     Genus & species
-------------------------------------------------------
Arctic fox           Canidae    Alopex lagopus
Bat-eared fox        Canidae    Otocyon megalotis
Bushdog              Canidae    Speothos venaticus
Cape hunting dog     Canidae    Lycaon pictus
Coyote               Canidae    Canis latrans
Crab-eating fox      Canidae    Cerdocyon thous
Dingo                Canidae    Canis familiaris dingo
Dog                  Canidae    Canis familiaris
Grey or Timber wolf  Canidae    Canis lupus
Raccoon dog          Canidae    Nyctereutes procyonoides
Red fox              Canidae    Vulpes vulpes

As you can see, there is very little consistency in the English names. Using the above guidelines, we would allocate a single word for all members of family Canidae. In the sample language, we will use the action root "-bawa-", meaning 'to bark'. Thus, the basic noun "bawamoda" would refer to any canine, such as 'dog', 'fox', or 'wolf', and the adjective "bawamono" would be equivalent to the English adjective meaning 'canine'. Now, if the proper noun for 'Arctic' is "Nartikidaya", we could create the mnemonic compound "Nartikinoya bawamodawe" for 'Arctic fox'. If the root meaning 'grey' is "-muzge-" (default class = P-s), then the compound "muzgeno bawamodawe" would mean 'Grey wolf'. (Note that this is the same approach we used earlier to derive the mnemonic compound meaning 'Black bear'.) For Canis familiaris, we need to allocate a unique common noun. In the sample language, we will use the state root "-bue-" to represent the relationship meaning 'friend' (default class = P/F-s). Thus, the full name for 'dog' is "bueno bawamodawe", and the simple common noun is "buemoda". We could also create a macro that includes more than one species. For example, we could create a single macro to represent all species that we think are 'wolf-like', and which would correspond in meaning to the English word "wolf".
The problem, though, is that this is inherently arbitrary and you will NEVER find agreement among all natural languages on how such divisions should be made. And if you do it for English words such as "wolf" and "fox", then, in all fairness, you must accept the impossible task of doing it for all other natural languages as well. For the non-living noun classes, I suggest the following approach:

1. If a combination of verb root plus noun classifier is highly suggestive or mnemonic, then use it.

2. Otherwise, if a concept can be implemented by exactly two simpler words, then use the two-word compound, even if the result is slightly too general. For extremely common concepts, a unique common noun can be created.

3. Otherwise, a single word should be created to represent the concept.

Using the above approach, words such as "breadbox", "bookshelf", and "desk" (i.e. 'writing table') will be implemented as compounds, while words such as "window", "computer", and "island" will be implemented as single words. Note that approaches (2) and (3) can also be applied to verbs. By allowing compounds that are slightly more general in meaning than their English counterparts (e.g. 'writing table'), the results are more likely to encompass the meanings of equivalent words in other natural languages. Finally, keep in mind that, in normal usage, a long name such as "Nartikinoya bawamodawe", meaning 'Arctic fox', will be used only once in the text or discourse to introduce the name. From that point on, the compound anaphor "banaha" would be used.

28.0 WORD DESIGN PROCEDURE

Throughout this monograph, we have been using paraphrases to define the meanings of particular derivations. These paraphrases are very much like dictionary definitions, although more primitive. Also, there is nothing arbitrary about the choice of words used in each paraphrase - a paraphrase can always be unambiguously generated from the meanings of the component morphemes.
This ability to generate paraphrases unambiguously means that the paraphrases can be automatically generated by a computer, which can greatly speed up the word design process. Of course, the word designer will still have to choose the root concepts using the guidelines that we discussed in the previous section. Once a root concept has been chosen, though, a properly programmed computer can then automatically generate paraphrases of the derivations that use the root concept. For example, if we define the following root concept:

State paraphrase: 'liquid'
Noun paraphrase: 'water'

then a computer can automatically generate the following questions:

A/P-s: agent maintains patient in a 'liquid' state = ?
A/P-d: agent causes patient to become 'liquid' = ?
P-d: patient becomes 'liquid' = ?
...
Mammal: 'water' mammal = ?
...
Energy: 'water' energy = ?

And so on. Appropriate paraphrases can be programmed for all classes. If a paraphrase has an equivalent word or fixed expression in the natural language of the AL designer, then the designer can provide the word or expression to the computer, which will automatically create the dictionary. Paraphrases of derivations using modifying morphemes can use the native word instead of continually repeating the more verbose paraphrase. For example, the following would be the paraphrase for the English word "learn":

P/F-d: to become more 'knowledgeable' about focus = ?

Once the AL designer has provided the word "learn", further derivations can use this word instead of the more verbose paraphrase. Thus, the verb meaning 'to master' would have the following paraphrase:

maximum augmentative: to 'learn' to the greatest degree possible = ?

instead of the more verbose:

maximum augmentative: to become more 'knowledgeable' about focus to the greatest degree possible

In general, specific root morphemes should NOT be chosen at this stage of the design.
In other words, the AL designer should not create any actual words, but should postpone the selection of actual root morphemes until later. After a large number of words have been defined, statistical tests of the results can be performed to determine the distribution of number-of-roots versus number-of-productive-derivations. This will allow the designer, ultimately, to assign the shortest roots to the most productive root concepts.

28.1 SAMPLE DERIVATION

Just for fun, here's a fairly large (but incomplete) set of derivations using the root "ja-", the speech morpheme we discussed earlier. As a root, it represents the speech act meaning 'say/tell/speak' (default class = A/P/F-p). Here goes...

Basic Verbs:
jasi - to tell (e.g. I told John a joke.)
jagasi - to say, utter, express (e.g. He said one word to me. "-ga-" = anti-passive CCM)
jaxisi - to express, give expression to ("-xi-" = anti-middle)
jakuasi - to speak (e.g. He spoke to her about the meeting. "-kua-" = double anti-passive)
jamiusi - to have a talk with ("-miu-" = anti-anti-passive)
jabosi - to discuss, to confer about ("-bo-" = reciprocal)
jatasi - to ask

Basic Nouns:
jamoda - man, human ("-mo-" = mammal)
jasuda - parrot ("-su-" = bird)
japustada - dragon ("-pusta-" = reptile)
japoda - ent, Tolkien's talking tree people ("-po-" = tree)
jafiuda - speech synthesizer, vocoder(?) ("-fiu-" = non-living, artificial, matter & energy)
javauda - vocal cords ("-vau-" = living matter)
jateda - stage ("-te-" = artificial location)
jamoida - theater ("-moi-" = building/residence)
jaboteda - forum
jakida - electric speaker ("-ki-" = artificial item)
japaida - speech sound/energy ("-pai-" = non-living energy)
jacada - megaphone ("-ca-" = tool/implement)
jameskoda - microphone ("-mesko-" = measurement/detection device)
jabiuda - language/speech community ("-biu-" = abstract group)
jaxoda - language ("-xo-" = performance)
jaloxoda - dialect ("-lo-" = similar/like)
jatiwada - linguistics ("-tiwa-" = field or profession)
janeyada - linguist ("-neya-" = member of a profession)
jabolida - dialogue, conversation, discussion ("-li-" = performance component/result)
jalida - utterance, speech act
jajulida - phoneme ("-ju-" = 'minimal' MCM)
jasolida - morpheme ("-so-" = 'not too' MCM)
jagelida - word ("-ge-" = 'very' MCM)
japilida - sentence ("-pi-" = 'maximal' MCM)
japisenjelida - paragraph ("-senje-" = group CCM)

[Note that "-senje-" occurs BEFORE the noun classifier, and is thus being used only for its mnemonic value. The alternative "japilisenjeda" means literally 'group of sentences', which is not necessarily a paragraph.]

Jabodaya - the name of the sample language ("-daya" = proper noun terminator)

Miscellaneous:
jasifnesi - to reply, to respond ("-sifne-" = 'back/in return')
jakuada - statement ("-kua-" = double anti-passive)
jasifnekuada - reply, response
jatakuada - question, query
japino - talkative, garrulous
japiloida - chatterbox, big mouth ("-loi-" = rude/insulting MCM)
jabocausi - to shoot the breeze about ("-cau-" = informal MCM)
jaboloisi - to shoot off one's face about
jageliniogasi - to verbalize, to put into words, to express in words
jabolicauda - bull session
jaguisi - to blurt out (to), to tell/say accidentally ("-gui-" = P/F-p verb classifier)

And I'm sure there are many others.
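Many of the nouns in the list above are a single root-plus-classifier-plus-terminator assembly, which is exactly the kind of thing a word-design tool could generate mechanically. Here is a minimal sketch (the helper name and dictionary are mine; the classifier values are the ones glossed in the list):

```python
# A few of the noun classifiers glossed in the derivation list above.
NOUN_CLASSIFIER = {
    "mammal": "mo",
    "bird": "su",
    "tree": "po",
    "performance": "xo",
    "field": "tiwa",
}

def noun(root: str, cls: str) -> str:
    """Assemble root + noun classifier + common-noun terminator '-da'."""
    return root + NOUN_CLASSIFIER[cls] + "da"

# Reassembling some attested words from the list:
words = {
    noun("ja", "mammal"): "man, human",       # jamoda
    noun("ja", "bird"): "parrot",             # jasuda
    noun("ja", "tree"): "ent",                # japoda
    noun("ja", "performance"): "language",    # jaxoda
    noun("ja", "field"): "linguistics",       # jatiwada
}
```

This is just the concatenation step; choosing WHICH root-plus-classifier combinations are worth lexicalizing remains the designer's job, as the previous sections argue.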
Keep in mind that whatever appears before a noun classifier does not have to be semantically precise - it can be used for its mnemonic value. Also, note that since "ja-" is a speech act, non-agentive verb derivations are not very useful.

29.0 USING WORDS: LITERALNESS, POLYSEMY, METAPHOR, AND IDIOM

Throughout this monograph, we've seen many examples of derivations whose English counterparts were periphrastic, polysemic, metaphoric, or even idiomatic. In fact, when speakers of natural languages use non-literal language, it is almost always because they are forced to do so: either their vocabulary does not have an appropriate literal construction available, or the available construction is one that the speaker is not comfortable using. This is unfortunate, because the way that a non-literal construction will be interpreted depends very much on the native language and culture of the listener. For example, metaphoric use of the word "pig" can have meanings such as "slob", "sex maniac", or "over-eater" in English, but will have different meanings to speakers of other languages. Also, as we've seen many times throughout this monograph, many metaphors, including the above examples, can be avoided by using appropriate derivations instead. For example, pejorative morphemes can be used to implement the above examples. In fact, I have become completely convinced that a properly derived word can replace ANY required or unavoidable metaphor, and it can never be misinterpreted by native speakers of other languages. Thus, the goal of an AL designer should be to provide the means to say ANYTHING without the need for non-literal language. In other words, metaphor, polysemy, and idiom should be optional - they should NEVER be obligatory. It is also my opinion that non-literal language should generally be avoided (except where its use is obvious to all listeners or readers), since the possibility for misunderstanding is so great.
[Since this monograph is about the use of semantic precision in word design, it is not the proper place to discuss non-literal language, and I will say no more about it here. However, if you would like to read more about the dangers of metaphor, see my separate essay entitled "Metaphor".]

30.0 FINAL WORD ON FOCUS

At the very beginning of this monograph, I stated that the focus case role is vague and even somewhat "out-of-focus". Furthermore, even our working definition of focus is vague: that the focus is the referent of an actual or potential relationship with the patient. Actually, I don't really think that the above definition is needed, even though I DO believe that it is accurate. In fact, we can come up with a different and perhaps better definition if we look at our primary case roles as sets of binary features, an approach which is often quite useful in linguistics. There are only two features, agent and patient, and there are only four possible combinations:

A case role -> +agent, -patient
P case role -> -agent, +patient
AP case role -> +agent, +patient
F case role -> -agent, -patient

In other words, the focus case role is the primary case role that is neither agent nor patient. Thus, focus is indeed vague, but it is definitely not ambiguous. Also, even though we have derived many verbs that do not have a focus as part of their argument structure, the simple fact is that ALL verbs are focused to some degree. When a focus is not explicit, it is either incorporated into the meaning of the verb or is too vague, general, or variable to require an explicit form. Because of this, we are able to de-focus words that, on first examination, seem to be inherently focused, such as locatives (e.g. "to stay put" vs. "to stay at"), temporals (e.g. adverb "earlier" vs. case tag "before"), and so on. In each case there is a "default" focus that we all seem to understand intuitively. Sometimes the default focus is the time of the utterance (e.g. the temporal adverbs) or the initial location of the patient (e.g. the locative adverbs), simply because no other interpretation makes sense. At other times the meaning seems to be totally idiosyncratic, and can include any or all possible foci. Unfortunately, aside from some vague and elusive ideas, I do not know how to describe the semantics of these default foci, and I'm not sure it's even possible. At this point in time, I can only provide the following guidelines:

If it is truly possible for an unfocused verb to have more than one referent in the provided context, then it DOES have any or all of them, and any or all of these possible referents IS the default focus (e.g. "The country is rich" vs. "The country is rich in copper").

If it is NOT possible for an unfocused verb to have more than one referent in the provided context, then the default focus is the ONE possible referent that makes sense (e.g. "The old man left the house" vs. "The old man went away").

The above is not what I would call 'semantically precise', but it's the best I can do for now.

31.0 DEFAULT VERB CLASSES

Several times throughout this monograph, I indicated that certain roots had default verb classes without explaining WHY I chose those particular defaults. In this section, I would like to summarize the basic verb types. In doing so, the reasons for the defaults will become clear. I've already stated that there are two basic verb types: state verbs and action verbs. State verbs are derived from patient-oriented concepts, while action verbs are derived from agent-oriented concepts. However, these two basic types are not sufficient to decide which defaults should be used. We need to sub-divide the basic verb types into more detailed categories. Here is a list of those categories and the default verb classes that I have chosen to use in the sample language:

Action verbs:
  Physical acts: default = A/P-d, e.g. to kick, to ram, to slap
  Speech acts: default = A/P/F-p, e.g. to say, to curse, to congratulate
  Activities: default = AP-s, e.g. to sing, to smoke, to swim

State verbs:
  Mental states: default = P/F-s, e.g. loving, knowing, fearing, wanting
  Relational states: default = P/F-s, e.g. inside, after, possessing, meaning
  Scalar states: default = P-s, e.g. blue, heavy, intelligent, smelly
  Numeric states: default = P-s, e.g. five, seventh, many
  Deictic states: default = P-s, e.g. this, now, you, here
  Binary states: default = A/P-d, e.g. alive, closed, asleep, broken

I chose the above defaults to reflect the argument structures that are most commonly used with the verbal subtypes. For example, activities such as "singing", "playing", and "dancing" are most commonly used as AP-s verbs. When focused, they elaborate the event; i.e. "sing a song", "dance a polka", etc. The focus of an action verb always elaborates the event; i.e., it provides more detail about what the agent is doing. The focus of a mental state elaborates what the mind is doing; i.e., it indicates what the mind is focused on. The focus of a relational state indicates the referent of the state. The focus of a scalar state indicates the actual position of the state on a scale of possibilities; i.e., it elaborates the magnitude of the state (or the change in magnitude for dynamic verbs). The focus of deictic and binary states is almost always incorporated into the verb, but it elaborates the state on the rare occasion when it is actually used. The focus of a numeric state is the larger set from which the more specific quantity is being selected. Note that all modal concepts are mental states, and temporal and locative concepts are relational states. A useful generalization is that the focus of ALL verbs provides greater context for the situation or event.
More specifically, the focus of an action verb elaborates the event (i.e., it tells us more about what the agent does), while the focus of a state verb is the referent of the state (i.e., it tells us more about the state of the patient relative to the focus). With the above in mind, we can now present a chart that indicates the default verb classes for all root concepts:

Physical acts: A/P-d
Speech acts: A/P/F-p
Activities: AP-s
Register acts: A/P-p
Mental states (including modals): P/F-s
Relational states (including temporals and locatives): P/F-s
Scalar states: P-s
Numeric states: P-s
Deictic states: P-s
Binary states: A/P-d
Basic nouns (including abstract nouns): P-s
Generic "-ze-": A/P-s
All other morphemes (including scalar polarity morphemes): "0"

32.0 SUMMARY: A COMPREHENSIVE LEXICO-SEMANTIC SYSTEM

I hope that by now I have convinced the reader of the value of a powerful derivational system. I cannot emphasize too much that a system like the one that I'm proposing here will maximize the neutrality of the vocabulary of an AL, while almost completely eliminating the need for ad hoc and arbitrary word creation. It will also reduce to an absolute minimum the number of morphemes that a student of the language will have to memorize. One of the greatest difficulties in learning a new language is mastering the idiosyncrasies of the vocabulary. This is so because a word in one language rarely means exactly the same thing as its closest counterpart in a different language. In other words, the "semantic space" of each word in a natural language is arbitrary - the result of centuries of evolution and accident. In effect, each word of a natural language has built-in irregularities that the student must learn. Unfortunately, most AL designers unwittingly clone their native vocabulary, not realizing the difficulty that will be faced by potential students of the AL.
The net result is that the meaning of a word cannot be deduced from more basic and universal concepts that have the same meaning for everyone, but instead depends almost exclusively on its meaning in only one natural language - the native language of the AL designer. In such an AL, the semantic space of each word is arbitrary, and mastering the idiosyncrasies of the entire vocabulary can take years of effort. Thus, different speakers WILL use the words differently, and misunderstandings WILL occur because there are no rules that can be followed to determine the precise semantic space of a word. Instead, each speaker will use the word in the same way he would use the closest equivalent in his native language. In the system proposed here, the semantic space of each word is precisely defined in terms of the much more basic meanings of the morphemes that make up each word. And while there may be some arbitrariness in the selection of the root concepts, the overall arbitrariness of the entire vocabulary will be much, much less. Thus, even though we may never be able to achieve true neutrality, we can certainly come very close.

APPENDIX A: THE PHONOLOGY AND MORPHOLOGY OF THE SAMPLE LANGUAGE

Word ::= { Morpheme } + Part-of-speech
Morpheme ::= C D | C V { X }
Part-of-speech ::= Terminator
Terminator ::= bie | cu | da | dawe | daya | di | fo | giu | ha | he | hi | ho | je | jewe | jeya | ka | nia | no | nowe | noya | pe | pewe | peya | si | sia | siwe | tiu | vu | vue | vuya

C = any consonant (p, b, t, d, k, g, c, j, l, m, n, f, v, s, z, x)
D = any diphthong (ai, au, eu, ia, ie, io, iu, oi, ua, ue, ui, uo)
V = any vowel (a, e, i, o, u)
S = any semi-vowel (w, y)
X = extension = S V | C C V
| = logical 'or'
{ } = enclosed item may appear zero or more times

Lower case letters represent themselves. Pronounce vowels as in Italian, Swahili, or Japanese (i.e. the five cardinal vowels /a/, /e/, /i/, /o/, and /u/).
Pronounce consonants as in English, except for the following: "c" is like "ch" in "church", "x" is like "sh" in "shop", and "q" is like "s" in "measure". [The consonant "q" is not very common among the world's languages, but I included it to maintain the balance between voiced and unvoiced consonants. I would NOT use it unless it becomes absolutely necessary.] The consonant "n" may be pronounced as a velar nasal (like "ng" in "sing") before "g" and "k". Pronounce "y" as in "royal" and "w" as in "awake". A word with the above morphology can always be parsed unambiguously into its component morphemes, and a stream of words can always be divided unambiguously into individual words even if there are no spaces between words. Thus, the boundaries between morphemes and words are never in doubt. This feature of word morphology is usually referred to as either _self-segregation_ or _auto-isolation_. Because of the self-segregation rules, a morpheme cannot have the form of a terminator. For example, a root with the form "-da-" is illegal because "da" is a terminator. However, "-daye-", "-daspo-", and "-sunda-" ARE legal because the extensions "-ye-", "-spo-", and "-nda-" create unique morphemes. [Note that I do not consider a terminator to be a morpheme, even though it has the form of a morpheme. This may not be technically correct, but it is useful for our purposes.] There are limitations on which vowels may be juxtaposed in a vowel cluster and on which consonants may be juxtaposed in a consonant cluster. As a general rule, NO geminate phonemes are allowed. Thus, "dd", "aa", "nn", and so on can never occur in the language. Only the following vowel clusters are allowed:

ai au eu ia ie io iu oi ua ue ui uo

In the above clusters, the vowels "i" and "u" may be optionally pronounced like the semivowels "y" and "w", respectively (pronounce "ui" as /wi/, and "iu" as /yu/). Additional vowel nuclei can be formed by placing a semi-vowel between two vowels.
For example, we cannot have "ea" but we can have "eya", and we cannot have "oa" but we can have "owa". However, "y" may never be followed by "i" and "w" may never be followed by "u". Thus, combinations such as "oyi" and "awu" are forbidden. If "y" is inserted after "e" or "i", or "w" is inserted after "o" or "u" in one of the allowable vowel clusters, then the result will be an allophone with exactly the same meaning. Thus, "iyu" is identical to "iu", "uwa" is identical to "ua", "eyu" is identical to "eu", and so on. In effect, a diphthong is simply a VSV in which the semi-vowel has been deleted; the semi-vowel "w" is deleted if either V is "u", and the semi-vowel "y" is deleted if either V is "i". Thus, for example, "-io-" is actually "-iyo-", "-eu-" is actually "-ewu-", "-ui-" is actually "-uwi-", "-iu-" is actually "-iyu-", and so on. For consonant clusters, ALL combinations of exactly two consonants are allowed except as follows:

1. Only one of the consonants in a cluster may be a stop or an affricate. For example, "km", "nt", and "zg" are allowed while "kc", "pt", and "db" are not allowed.

2. Only one of the consonants in a cluster may be a fricative. For this test, affricates are considered to be fricatives. For example, "sk" and "dz" are allowed while "sc", "cx", and "zj" are not allowed.

3. Both consonants must have the same voicing unless one of them is "l", "m", or "n". Thus, "nc", "st", "px", and "gv" are allowed, but "sd", "bx", and "kv" are not allowed.

4. The clusters "tx" and "dq" are not allowed because they can be easily confused with the affricates "c" and "j".

5. The consonant "h" may never be the first consonant in a cluster, and may only appear after "l" or "n".

Stress is not necessary for proper understanding. However, for the sake of consistency, it is recommended that stress be applied according to the following rules:

1. A syllable is defined such that any occurrence of CV or SV is the beginning of a new syllable.
Thus, a syllable can have one of five forms: CV, CVV, CVC, SV, and SVC. Examples: da, fe/wa/da, ki/di, Po/da/ya, do/sau/pe, gam/bu/ma/yas/ti/no, can/do/ka, etc. Note that a syllable boundary may appear within a morpheme, but that the start of a morpheme is always the start of a syllable.

2. All words should be stressed on the next-to-last syllable (i.e. penultimate stress).

3. If a word contains more than four syllables, then also stress the second syllable.

4. Stress can be applied as greater volume, higher pitch, longer duration, or any combination thereof.

Default classes of roots depend on their meaning as follows:

    Physical acts: A/P-d
    Speech acts: A/P/F-p
    Activities: AP-s
    Register acts: A/P-p
    Mental states (including modals): P/F-s
    Relational states (including temporals and locatives): P/F-s
    Scalar states: P-s
    Numeric states: P-s
    Deictic states: P-s
    Binary states: A/P-d
    Basic nouns (including abstract nouns): P-s
    Generic "-ze-": A/P-s
    All other morphemes (including scalar polarity morphemes): "0"

Once a class is assigned, either explicitly or by default, only a classifier or class-changing morpheme can change it. For example, if a word consists of exactly two root morphemes and a terminator, then the first root will provide the class.

Previous-word modifiers have the property of modifying an entire syntactic constituent when they are applied to the head word of a constituent. For example, when a previous-word modifier follows a noun, it will modify the entire noun phrase.

APPENDIX B: MORPHEMES OF THE SAMPLE LANGUAGE

This appendix contains a complete list of all of the morphemes that were created in this monograph, including terminators.
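The consonant-cluster constraints given earlier are mechanical enough to check by machine. The sketch below encodes rules 1-5 in Python; note that the phoneme classification (which consonants count as stops, affricates, fricatives, and voiced obstruents) is my own reading of the sound inventory, not a table the monograph provides.

```python
# Sketch of the two-consonant cluster constraints (rules 1-5).
# ASSUMPTION: the classification below is inferred from the described
# sound inventory; it is not spelled out as a table in the source.

STOPS      = set("pbtdkg")
AFFRICATES = set("cj")          # "c" = /ch/ in "church", "j" its voiced mate
FRICATIVES = set("fvszxqh")     # "x" = /sh/, "q" = /zh/ as in "measure"
VOICED     = set("bdgjvzq")     # voiced obstruents; l/m/n are exempt
EXEMPT     = set("lmn")         # rule 3 voicing exemption

def cluster_ok(c1: str, c2: str) -> bool:
    """Return True if the two-consonant cluster c1+c2 is legal."""
    if c1 == c2:                                       # no geminates, ever
        return False
    if sum(c in STOPS | AFFRICATES for c in (c1, c2)) > 1:   # rule 1
        return False
    if sum(c in FRICATIVES | AFFRICATES for c in (c1, c2)) > 1:  # rule 2
        return False
    if c1 not in EXEMPT and c2 not in EXEMPT:          # rule 3
        if (c1 in VOICED) != (c2 in VOICED):
            return False
    if c1 + c2 in ("tx", "dq"):                        # rule 4
        return False
    if c1 == "h":                                      # rule 5: "h" never first...
        return False
    if c2 == "h" and c1 not in "ln":                   # ...and only after "l"/"n"
        return False
    return True

# Sanity checks using the monograph's own examples:
assert all(cluster_ok(*p) for p in ["km", "nt", "zg", "sk", "dz",
                                    "nc", "st", "px", "gv", "lh"])
assert not any(cluster_ok(*p) for p in ["kc", "pt", "db", "sc", "cx", "zj",
                                        "sd", "bx", "kv", "tx", "dq", "mh"])
```

The assertions replay the allowed and forbidden examples from rules 1-5 above, plus "lh" (legal under rule 5) and "mh" (illegal, since "h" may only follow "l" or "n").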
Terminators:
    Verb: -si-
    Proper verb: -sia-
    Mnemonic verb: -siwe-
    Imperative: -cu-
    Adverb/case tag: -pe-
    Proper adverb: -peya-
    Mnemonic adv: -pewe-
    Noun: -da-
    Open noun: -giu-
    Proper noun: -daya-
    Mnemonic noun: -dawe-
    Adjective: -no-
    Open adjective: -bie-
    Proper adj: -noya-
    Mnemonic adj: -nowe-
    Previous-word modifier: -di-
    Open PWM: -nia-
    Vocative: -vu-
    Proper voc: -vuya-
    Mnemonic voc: -vue-
    Modal/tense/aspect/disjunct: -fo-
    Particles: -ka-
    Anaphora: -ha- -he- -hi- -ho-
    Reserved: -je- -jewe- -jeya- -tiu-

Verb Classifiers:
    A/P/F-s: -tue-     A/P/F-d: -ko-     A/P/F-p: -nio-
    A/P-s: -zoya-      A/P-d: -pu-       A/P-p: -ce-
    AP/F-s: -fi-       AP/F-d: -sua-     AP/F-p: -mi-
    AP-s: -panji-      AP-d: -za-        AP-p: -diu-
    P/F-s: -ma-        P/F-d: -do-       P/F-p: -gui-
    P-s: -se-          P-d: -pia-        P-p: -moncu-
    0/A: -fia-         0/AP: -piu-       0/P: -gu-
    0/F: -jo-          0: -la-

Noun Classifiers:
    matter & energy:
        living, animals: -nembi-
            vertebrates, mammals: -mo-
            birds: -su-
            reptiles: -pusta-
            fish: -sai-
            insects (all arthropods): -zio-
        living, plants: -kaya-
            trees & shrubs: -po-
            other spermatophytes: -tonze-
        non-living, natural: -ji-
        non-living, artificial: -fiu-
    matter:
        living: -vau-
        non-living, natural, substance: -fa-
            locative: -nai-
            other: -le-
        non-living, artificial, substance: -niu-
            locative: -te-
            other: -ki-
    energy:
        living: -dengi-
        non-living: -pai-
    time: -be-

Abstract Nouns:
    -ta- measurements
    -biu- groups/organizations
    -xo- performances
    -li- performance components and results
    -tiwa- fields of endeavor
    -vo- field components (i.e. schools) and results (i.e. styles)
    -neya- member of a profession (= -tiwa-panji-)

Compounding Classifiers:
    -kai- room
    -moi- building/residence
    -xempi- shop/business
    -mesko- measuring device
    -ca- tool/implement
    -gau- vehicle

Class-Changing Morphemes:
    -na- not, other than
    -de- middle
    -xi- anti-middle
    -gue- anti-anti-middle
    -voi- double middle
    -ceu- double anti-middle
    -nu- passive
    -ga- anti-passive
    -miu- anti-anti-passive
    -jau- double passive
    -kua- double anti-passive
    -vi- inverse ( A/P/F-x -> P/A/F-x )
    -viga- obviative ( P/F-x -> F-x [+P], allowing case tag to bind to focal subject )
    -ne- cosubject ( demotes part of the subject and makes it obliquely expressible )
    -sau- non-subject ( an entity is specifically excluded from being subject )
    -gi- convert to count noun
    -jazmi- convert to mass noun
    -senje- convert to group noun
    -ve- essential quality or ability
    -pa- process
    -mante- process result or product
    -xa- genitive (result class = P-s)
    -tu- reflexive
    -bo- reciprocal ( A/P/F -> A=P/F )
    -pasku- P=F reciprocal ( A/P/F -> A/P=F )
    -fu- infinitive, same subject as outer verb
    -vua- make all arguments of a verb oblique
    -vie- determine/measure state (result class = AP/F-d)

Scalar Polarity Roots/MCMs:
    -lau- too, excessively
    -pi- maximally, extremely
    -ge- very, highly
    -xe- ??? midpoint, average, so-so ???
    -so- not too, not very
    -ju- minimally, barely, hardly
    -zunda- almost, not quite
    -pinte- definitely, absolutely (= 100% epistemic probability)
    -junte- not at all, not ... whatsoever (= 0% epistemic probability)
    -pusli- just, only
    -ku- interrogative

Register Roots/MCMs:
    -xemna- fawning, groveling, subservient
    -tenko- humble
    -mio- polite, very formal
    -zai- formal
    -cau- informal, slang
    -loi- contemptuous, rude, insulting
    -pie- vulgar, filthy, tasteless
    -xua- macho
    -xesmi- effeminate

Numeric Roots/MCMs:

A basic number will have the following format:

    ( radix ) + ( minus sign ) + [ digit ] + ( decimal point + [digit] )
    + ( exponent + ( minus sign ) + [digit] ) + ( ordinality ) + part-of-speech

Here are the number-forming morphemes:
    -heksi- hexadecimal radix (default = base ten)
    -minsu- minus sign (default = positive)
    -zeyo- zero
    -fe- one
    -du- two
    -zi- three
    -kau- four
    -poi- five
    -bua- six
    -vastu- seven
    -ketsa- eight
    -go- nine
    -dai- ten
    -senti- hundred
    -kio- thousand
    -milni- million
    -maya- A hex = 10 decimal
    -biwi- B hex = 11 decimal
    -cawa- C hex = 12 decimal
    -doyo- D hex = 13 decimal
    -neye- E hex = 14 decimal
    -fuyu- F hex = 15 decimal
    -divde- divider, X/Y
    -fevde- divider, 1/X
    -cuye- decimal point
    -jinta- exponent
    -hu- negative exponent
    -vevna- real/imaginary separator
    -kawa- N at a time, N per group, in groups of N
    -saksi- all, the whole amount
    -mai- many, much, a lot, a large amount
    -xandu- not too many, not too much
    -pewa- few, little, a small amount
    -zonja- any, positive non-zero, one or more, greater than zero

Ordinality:
    -- cardinal (this is the default)
    -xunga- ordinal

Deictics:
    1: mi-          Pers: -st-
    2: du-          Sing: -a-
    3: se-          Dem: -mp-
    1+2: ci-        Plur: -i-
    1+3: be-        Loc: -ng-
    2+3: fa-        Unspec: -u-
    1+2+3: po-      Tem: -lk-

Tense/aspect:
    Tense                 Aspect
    -----                 ------
    Past: -lu-            Perfect: -- (default)
    Present: -co-         Imperfect: -nsa-
    Future: -ti-          Iterative: -mpo-
    Unspecified: -ba-     Habitual: -ntu-
    Inceptive: -spi-      Continuative: -mbe-
    Terminative: -nzi-    Completive: -ksu-
    Reserved: -ple-       Reserved: -lto-
    Unspecified: -nda-

Modals:
    Degree                Modality
    ------                --------
    100%: pi-             Probability (epistemic): -nte-
    High: ge-             Evidentiality (epistemic): -sna-
    Low: so-              Adequacy (epistemic): -ngo-
    Very low: ju-         Significance (epistemic): -mbe-
    0%: na-               Obligation (deontic): -ndu-
    Undefined: xe-        Inevitability (deontic): -sko-
    Interrogative: ku-    Necessity (deontic): -tsi-
                          Importance (deontic): -spu-
                          Speaker-oriented obligation: -nka-

Other Roots/MCMs:
    -bawa- bark, howl (action)
    -benzo- closed/shut/unopened (binary state)
    -bue- friend (relational state)
    -denga- dirty (scalar state)
    -gaya- female
    -gelba- yellow (scalar state) [also 'banana' (basic noun)]
    -gua- liquid (binary state)
    -hayu- heavy (scalar state) [also 'bear' (basic noun)]
    -ja- speech morpheme
    -jandoya- congratulate (speech act)
    -kanti- having quantity/amount (scalar state)
    -kapsu- same, equal
    -ke- up/above (locative state)
    -lenga- spatially long (scalar state)
    -lo- similar, like, about, approximately
    -me- located at/in (locative state)
    -muzge- grey (scalar state)
    -pau- contingency relationship, P is contingent on F
    -sawa- temporally long (scalar state)
    -sifne- reciprocal 'back/in return'
    -tenci- smart/intelligent (scalar state)
    -teyo- knowing/knowledgeable (mental state)
    -tomba- hedge
    -veya- real/existent (binary state)
    -xau- hot (scalar state)
    -xava- black (scalar state)
    -xawe- clean (scalar state)
    -xenda- feeling love/affection
    -xoya- alive/living (binary state)
    -xumpi- white (scalar state)
    -ze- generic action verbs (default A/P-s)

Particles:
    Comparatives:
        taka 'than'
        geka 'more'
        pika 'most'
        kapsuka 'as much/many as'
        soka 'less'
        juka 'least'
        getaka = "geka taka" 'more than'
        pitaka = "pika taka" 'the most among'
        kapsutaka = "kapsuka taka" 'as much/many as'
        sotaka = "soka taka" 'less than'
        jutaka = "juka taka" 'the least among'
    Resumptive pronoun: ka
    Genitive: xaka
    Heavy topicalization particle: pika
    Referent-switching particle: boka
    Coordination initiator: ceka
    Coordination terminator: saksika
    Parenthetical start: suka
    Parenthetical stop:
        complete: toka
        incomplete: mika
    Other:
        neka 'and'
        pauka 'then'
        pauvika 'if'
        dengaka 'crap'
        deka 'shit'
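As a worked illustration of the numeric template above, the following Python sketch assembles a basic number word from the number-forming morphemes. Two points are my own reading rather than something the source states outright: multi-digit numbers are spelled digit by digit in order, and the sketch covers only the radix, minus-sign, digit, decimal-point, and ordinality slots (the exponent and divider morphemes are omitted).

```python
# Sketch: assembling a basic number word from the numeric morphemes above.
# Template: (radix)(minus sign)[digits](decimal point + [digits])(ordinality)
#           + part-of-speech terminator.
# ASSUMPTION: digits are simply concatenated in order; exponent and
# divider slots are not handled in this sketch.

DIGITS = {
    "0": "zeyo", "1": "fe",   "2": "du",    "3": "zi",    "4": "kau",
    "5": "poi",  "6": "bua",  "7": "vastu", "8": "ketsa", "9": "go",
    "A": "maya", "B": "biwi", "C": "cawa",
    "D": "doyo", "E": "neye", "F": "fuyu",
}
HEX_RADIX, MINUS, POINT, ORDINAL = "heksi", "minsu", "cuye", "xunga"

def number_word(text: str, terminator: str = "da",
                hexadecimal: bool = False, ordinal: bool = False) -> str:
    """Spell a number such as '-3.14' as one word (default terminator: noun "-da-")."""
    parts = []
    if hexadecimal:
        parts.append(HEX_RADIX)          # radix marker (base ten is the default)
    if text.startswith("-"):
        parts.append(MINUS)              # minus sign (positive is the default)
        text = text[1:]
    whole, _, frac = text.partition(".")
    parts += [DIGITS[d] for d in whole]  # whole-part digits, in order
    if frac:
        parts.append(POINT)              # decimal point
        parts += [DIGITS[d] for d in frac]
    if ordinal:
        parts.append(ORDINAL)            # ordinality (cardinal is the default)
    parts.append(terminator)             # part-of-speech terminator
    return "".join(parts)

print(number_word("-3.14"))                 # -> minsuzicuyefekauda
print(number_word("1F", hexadecimal=True))  # -> heksifefuyuda
```

Under these assumptions, '-3.14' as a noun comes out as "minsu-zi-cuye-fe-kau" plus the noun terminator "-da-", i.e. "minsuzicuyefekauda".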
********** The End **********