Designing an Artificial Language: Lexical Semantics

by Rick Morneau

First Internet draft date: July 30, 1995
Current draft date: March 23, 1996
Copyright 1995, 1996, 1997 by Richard A. Morneau

[Note: this monograph is still in the draft stage, and will be undergoing continuous revision. Ultimately, it will become the design document for an actual language that I plan to use as an interlingua for machine translation. A copy of the most recent version will always be available at ftp.eskimo.com in directory /u/r/ram/conlang as file LexicalSemantics. If you read and enjoy the monograph and have any helpful comments or suggestions, please feel free to contact me at ram@eskimo.com. If you do contact me, please quote sparingly from the monograph, since I have to pay long distance phone charges for my Internet connection.]

1.0 INTRODUCTION

In the following sections, I would like to discuss word design in artificial languages (henceforth ALs). Specifically, how can a language designer apply semantic rigor to the design of words?

One could spend a lot of time studying formal semantics (also called Montague, truth-conditional, or model-theoretic semantics), but I doubt if this will help much. Formal semantics as currently practiced is not only a pseudo-mathematical monstrosity, but it deals primarily with representation and analysis at the sentence level rather than at the word level. (One could also argue that this school of semantics doesn't deal with 'meaning' at ALL. But I didn't really say that. :-)

Other semantic disciplines, such as the prototype theories of cognitive linguistics, can certainly be helpful, since they are inherently more suitable for dealing with the basic fuzziness of language. However, prototype theories do not provide us with the formal tools we need to design new words. And as for fuzzy logic itself, there are still too many wrinkles to be ironed out, especially as applied to natural language.
A more practical solution, one that I feel is much better suited to designing new languages rather than analyzing old ones, is presented below. Basically, the solution I propose is to develop a simple but powerful derivational morphology that makes word design rigorous yet straightforward, while at the same time greatly reducing the number of basic morphemes (i.e. _primitives_) required by the language. Initially, I will not try to describe this method in abstract terms, since this discussion is intended for the non-linguist. Besides, I doubt that I would succeed. Instead, I will present the reader with many examples of various kinds of linguistic constructions, discuss the semantics of these constructions, introduce linguistic terminology where and as needed, and finally, try to derive some productive generalizations. I must warn you, though, that when all is said and done, this method is formal and rigorous. Those among you who are more interested in CREATIVE solutions to word design, or who wish to borrow words and meanings directly from existing natural languages, will not find this approach very useful or very interesting. 2.0 VERBS I'll start this exposition by looking first at verbs. Specifically, I will look at two of the most important criteria that go into defining a verb: its _valency_ (i.e. the number of basic arguments that it requires) and its _case requirements_ (i.e. the semantic roles played by the basic arguments). When combined, the valency and case requirements of a verb are usually referred to as the _argument structure_ of the verb. Before proceeding, though, let me give you a quick review of valency and case. Consider the following English sentence: The chimpanzee broke the window with a coconut. In this example, the verb "break" has a valency of two, since it requires two arguments: the subject "the chimpanzee" and the object "the window". 
The arguments are REQUIRED because, if either were missing, the resulting sentence would be ungrammatical (or, in the case of some verbs, would have a different meaning): *The chimpanzee broke. *Broke the window. [Please note that I am using the standard linguistic convention of indicating an unacceptable item by preceding it with an asterisk.] But the following is okay: The chimpanzee broke the window. For the verb "break", the case role of the subject is _agent_, and indicates the entity RESPONSIBLE for the event. The case role of the object is _patient_, and indicates the entity which EXPERIENCES the state or change of state described by the verb. In other words, the argument structure of the English verb "break" requires two arguments: the first argument (i.e. the subject) must be a semantic agent, and the second argument (i.e. the object) must be a semantic patient. Arguments required by a verb are called _core_ arguments. The phrase "with a coconut" is what is called an _oblique_ argument since it is not essential for the sentence to be grammatical. It simply provides additional peripheral information about what happened. In this sentence, it indicates the _instrument_ of the event. In other words, "a coconut" is the instrument used in carrying out the act indicated by the verb. If the sentence had been: The chimpanzee broke a thousand windows in Boston on Tuesday. "in Boston" would be a locative oblique argument, and "on Tuesday" would be a temporal oblique argument. [The case terminology that I am using here is fairly common, but not universal. Linguists who work with case grammar and thematic relations have yet to agree on the number and nature of case roles needed to adequately describe natural language. As it turns out, this lack of agreement is irrelevant to what we are trying to accomplish here. We will, in effect, create our own internally consistent, semantically precise, and easily expandable implementation of a case system.] 
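As a minimal sketch of the distinction just drawn between core and oblique arguments, the argument structure of transitive "break" can be modeled as a short list of required case roles, with anything else supplied in the sentence treated as oblique. The names used here (ArgumentStructure, is_grammatical, obliques) are hypothetical illustrations and not part of the notation used in this monograph, and the check is deliberately crude: it only asks whether every core role is present.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArgumentStructure:
    """Hypothetical model: a verb plus the case roles it requires."""
    verb: str
    core_roles: tuple  # required case roles, in subject/object order

    @property
    def valency(self):
        # Valency = number of core (required) arguments.
        return len(self.core_roles)

    def is_grammatical(self, supplied_roles):
        # A sentence is acceptable only if every core role is supplied.
        return all(role in supplied_roles for role in self.core_roles)

    def obliques(self, supplied_roles):
        # Anything supplied beyond the core roles is an oblique argument.
        return [r for r in supplied_roles if r not in self.core_roles]


BREAK = ArgumentStructure("break", ("agent", "patient"))

# "The chimpanzee broke the window with a coconut."
full = ["agent", "patient", "instrument"]
print(BREAK.valency)                     # 2
print(BREAK.is_grammatical(full))        # True
print(BREAK.obliques(full))              # ['instrument']

# "*The chimpanzee broke."
print(BREAK.is_grammatical(["agent"]))   # False
```

The locative and temporal obliques of the Boston example would simply appear as further entries in the supplied list, without changing the valency.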
In English, oblique arguments are usually _marked_ by preceding them with a preposition. Thus, the preposition is the marker which tells us the case role of whatever follows it. Agent and patient are almost always _unmarked_. The most common exception to this in English is in passive constructions, where the original subject is preceded by the preposition "by", as in "the window was broken BY the chimpanzee" or "the thieves were seen BY the children".

Some verbs, such as English "put", have a third, required argument (i.e., it is part of the valency of the verb), which is marked by a preposition. For example:

    *He put the book.
    He put the book on the table.

Here, the preposition "on" marks a _destination_ case role.

Incidentally, natural languages often allow a speaker to omit a core argument if it is obvious from context. For example, a Japanese speaker often omits the agent of a verb as a sign of politeness. This usage, however, performs a discourse function - NOT a grammatical function - and the omitted argument is still assumed to be present.

An additional case role that occurs within the valency of many verbs is what I will call _focus_. Linguists often call this case role _theme_, _object_, or _topic_, but there is no consensus, and their definitions often overlap other roles, especially patient. In all of the following examples, the direct object is the focus:

    The children saw the thief.
    The team needs a new coach.
    The woman remembered her father.
    The boys are playing baseball.
    The woman owns a beachhouse.
    The tarp covered the boxes.
    The fans enjoyed the game.
    The employees learned discretion.
    The man ignored his wife.
    The choir is singing a requiem.
    The boy loves his mother.
    The class is studying French.
    The fence surrounds three buildings.
    The old man told a story.

Note that in each of the above sentences, the direct object provides a reference point or focus for the event, without causing or being changed by the event.
It does this by pinpointing, narrowing down, or providing a reference for (i.e. 'focusing') the state or change of state indicated by the verb. Note that a focus does not play an active role in the event described by the verb, and is not obviously changed by the event. Thus, a focus can be best described as one of the following: 1. The entity on which the patient's attention or mental state is 'targeted' or 'focused'; e.g. to see, to play, to learn, to love, to tell, etc. 2. The referent of a relationship with the patient (i.e. the patient's state relative to the focus); e.g. to own, to surround, to include, to need, etc. 3. An elaboration of the event itself; e.g. to play, to sing, to tell, etc. Note that the concepts can overlap, as in "to need", "to avoid", "to know", and "to hate", since the object of such verbs can be considered the focus of a relationship OR of a mental state. In fact, without stretching the second definition too much, one could say that it applies to ALL focused events, even those involving perception or elaboration. For example, the sentence "John sees the forest" describes a relationship between "John" and "the forest", and the sentence "Louise sang a little ditty" describes a relationship between "Louise" and "a little ditty". Thus, we can say that the patient experiences a relationship whose referent is the focus. If the verb has an agent, then the agent is responsible for the relationship. The nature of the relationship is indicated by the meaning of the verb. It is important to keep in mind that the focus does not directly modify or interact with the patient. Perhaps the best and most useful generalization we can make is that the focus is the referent of a relationship with the patient, it is not affected by the event, and it is not responsible for the event. However, the precise meaning of the focus will ultimately depend on the meaning of the verb itself. Thus, it would appear that focus is not really a pure case role. 
Both agent and patient can be defined with semantic precision, while focus seems somewhat vague or even 'out-of-focus'. The reason for the vagueness is that it is possible to differentiate among the various senses of focus; e.g. the perceived entity ("to see"), the missing/lacking entity ("to need"), the locative reference point ("to surround"), an elaboration of the event itself ("to sing"), etc. But these senses never overlap for a particular verbal concept, and we would end up making distinctions that are never made in natural languages. Thus, focus IS a vague and general-purpose case role, but it is an essential one. In summary, the three major case roles that are capable of being included within the valency of a verb are: agent - the entity responsible for the event described by the verb patient - the entity which experiences the state or change of state described by the verb focus - the entity which acts as the referent of a relationship with the patient Thus, the agent is responsible for the event, the patient experiences the event, and the focus provides the referent for the state or change of state indicated by the event. [We will discuss the semantics of focus in more detail later on. First, though, we need to acquire a more substantial background in the semantics of verbs.] Note that an argument does not have to be a physical entity. It can also be an event. In the following examples, the direct object is the patient: We lengthened our trip. The police halted the procession. Bill chaired the seminar. Joe postponed the finance committee meeting. The army prevented the destruction of the village. The station repeated the broadcast. There are other case roles in addition to the ones I just mentioned, but they are all oblique (i.e., they are never required by a verb), or are only seen in a few verbs best categorized as oddballs ("to suit", "to elapse", "to postpone", etc.). I will discuss them as the need arises. 
For now, though, we have enough background to proceed with the discussion. In the following sections, I will discuss and classify a large and varied sample of English verbs, based on their semantics and their argument structures. While doing so, I will also introduce some of the terminology and the formal descriptive notation that I will be using throughout the remainder of this monograph. 2.1 VERB CLASSIFICATION - STATE VERBS Probably the largest group of verbs in English (or any language, for that matter) are called _state_ verbs, since they describe either an unchanging state of affairs or a change of state. Verbs which describe an unchanging or static situation are often called _stative_ verbs (do not confuse "stative" with "state"). Verbs which describe a changing or dynamic situation are often called either _process_ or _accomplishment_ verbs. Because linguists do not agree on the precise meanings of these terms, I will immediately abandon them and use the more generic expressions "static state verbs" and "dynamic state verbs". Let's start by looking at some static state verbs; i.e. verbs which describe a steady or ongoing state: The patients suffered. The boy sweated. The building shook. The baby slept. The fish stank. The stars twinkled. These verbs are all _intransitive_; i.e. they have a subject but no object. Also, each one describes the steady, ongoing state of the subject. Thus, the subject is the patient. From now on, I will refer to verbs of this type as "P-s", where "P" represents "patient" and "-s" indicates that the verb is a STATIC verb. Here are some more static state verbs with the form P-s: The plants were tall -> P-s verb = "to be tall" The door was closed -> P-s verb = "to be closed" The stew was salty -> P-s verb = "to be salty" The walls were blue -> P-s verb = "to be blue" The mouse was dead -> P-s verb = "to be dead" English speakers may be surprised to see adjectives and past participles being treated as descriptive verbs. 
However, words which describe steady states have just as much of a verbal nature as words which describe changes of state. The English verbs "to sleep", "to stink", "to twinkle", etc. illustrate this very well. In fact, many natural languages (e.g. Japanese, Korean, several Sino-Tibetan languages such as Mandarin Chinese, some Siouan languages, several Austronesian languages, and many native languages of Africa, Central America and South America) do not have true adjectives. Instead, these languages use words that are essentially intransitive verbs, and which can be inflected or otherwise used in the same way as any other intransitive verbs.

Now, the above examples represent intransitive STATIC state verbs. Here are some examples of intransitive DYNAMIC state verbs:

    The window broke.
    The ice melted.
    The plants grew.
    The baby fell asleep.
    The mouse died.
    The stew cooled.
    The patient recuperated.

The only difference between these and the previous examples is that the patient experiences a CHANGE of state rather than a steady state. Thus, these verbs are the dynamic counterparts of the intransitive static state verbs. From now on, I will refer to these verbs as "P-d", where "-d" indicates that the verb is a DYNAMIC verb.

Next, let's look at some verbs which describe events in which the subject causes something to happen to the object. These verbs are all _transitive_; i.e. they have both a subject and an object. Here are a few examples:

    He cured the patient.
    He broke the window.
    He killed the mouse.
    He closed the door.
    He salted the stew.
    He captured the thief.

In all of the above, the subject "He" is responsible for the event described by the verb. Also, in all cases, the event causes a CHANGE OF STATE to occur in the object. Thus, the subject is the agent and the object is the patient. In other words, these verbs are transitive dynamic state verbs.
For verbs like these, I will use the notation "A/P-d", where "A" represents "agent", "P" represents "patient", a slash "/" separates subject from object, and "-d" indicates that the verb is a dynamic verb. Note that English, unlike almost all other languages, uses exactly the same word for SOME of its P-d and A/P-d verbs: P-d: The window broke. A/P-d: John broke the window. P-d: The patient healed. A/P-d: The doctor healed the patient. Note though, that this usage is highly idiosyncratic, and many words that you would expect to follow the pattern do not: A/P-d: The doctor cured the patient. P-d: *The patient cured. P-d: The patient recuperated. A/P-d: *The doctor recuperated the patient. A/P-d: The cat killed the mouse. P-d: *The mouse killed. P-d: The mouse died. A/P-d: *The cat died the mouse. In the design of your AL, you can, of course, mimic English. However, I do not recommend it, since it is extremely uncommon among the world's languages and it is almost certain to cause confusion among most non-English speakers. So far, we've seen P-s, P-d, and A/P-d verbs. Thus, an obvious question is: are there such things as A/P-s verbs? Yes. And as the designation implies, these verbs always indicate that the agent maintains the patient in some kind of steady state. Thus, all of these verbs imply that the agent somehow "controls" the patient. Here are some examples: He operated the lathe. He ruled the country. He conducted the orchestra. He chaired the symposium. He used the hammer. He managed the company. Note that, although these verbs may imply both an entry into and an exit from the event or situation, the major emphasis is on the process BETWEEN the endpoints. For these reasons, these verbs are static rather than dynamic. Now, for states that are normally rendered using adjectives, English uses the particle "keep" to distinguish between A/P-s and A/P-d verbs. Here are some examples: He kept the door open. A/P-s verb = "to keep open" He kept the girl alive. 
A/P-s verb = "to keep alive" He kept the thief captive. A/P-s verb = "to keep captive" He kept his mother happy. A/P-s verb = "to keep happy" All of the above are effectively A/P-s verbs. English simply uses the particle "keep" to achieve the desired effect. A good paraphrase of these 'verbs' is "agent causes patient to remain in a steady state". Next, let's look at some verbs that use the focus case role that we discussed earlier. Here are some examples: The student needs money. The boy misses his father. The company owns the yacht. The child has the coloring book. The report lacks a cover. The kids enjoy the game. The man loves his wife. The policeman sees the thief. The girls hear the music. In all of the above, the subject experiences a steady state relative to the object. Thus, the subject is a patient, the object is a focus, and the verb is a static state verb. For these verbs, I will use the notation "P/F-s", where "F" represents the focus. It is also possible to have verbs like these which also have an agent. Here are some examples: The boy wanted the money. The lady looked at the house. (Think of "to look at" as a single complex verb.) The boys obeyed the rules. The girls listened to the music. (Think of "to listen to" as a single complex verb.) The children followed their parents. The priest thought about his sins. (Think of "to think about" as a single complex verb.) In the above examples, the subject not only experiences the state indicated by the verb, but is also responsible for the state; i.e., the subject is also in control. Thus, the subject is both the agent AND the patient, and the object is the focus. I will refer to these verbs as AP/F-s. Incidentally, notice how some of the above complex verbs become simple verbs when they are de-focused: The lady is looking. The lady is looking at the house. The girls are listening. The girls are listening to the music. The priest is thinking. The priest is thinking about his sins. 
Thus, the unfocused verbs would be described as AP-s. Later, we'll have much more to say about the difference in semantics between focused and unfocused verbs. It is also possible for AP/F verbs to indicate a change of state. Here are some examples: Louise befriended her classmate. Mike joined the party. John memorized the poem. John entered the room. Mary divorced the jerk two years ago. The man disowned his oldest son. Bill left the building. These verbs describe a situation in which the agent causes HIMSELF to undergo a change of state relative to the focus. Thus, they are all AP/F-d. Since all of this may be confusing, let me paraphrase the relationships in a way that illustrates the states and how they are focused: P/F-s: John saw the mouse. = John experienced a visually perceptive state focused on the mouse. AP/F-s: John looked at the mouse. = John maintained himself in a visually perceptive state focused on the mouse. P/F-d: John noticed the mouse. = John entered a visually perceptive state focused on the mouse. AP/F-d: John glanced at the mouse. = John caused himself to enter a visually perceptive state focused on the mouse. P/F-s: The platoon heard the music. = The platoon experienced an aurally perceptive state focused on the music. AP/F-s: The platoon listened to the music. = The platoon maintained itself in an aurally perceptive state focused on the music. P/F-d: John remembered the party. = John entered a state of remembrance focused on the party. AP/F-d: The platoon surrounded the village. = The platoon caused itself to be in a state of 'around' focused on the village. P/F-s: He loved his father. = He experienced a state of loving focused on his father. P/F-d: She learned discretion. = She entered a state of knowledge focused on discretion. 
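The designators introduced so far (P-s, P-d, A/P-d, A/P-s, P/F-s, AP-s, AP/F-s, AP/F-d) are regular enough to be read mechanically. The following is only a sketch under stated assumptions: the function name parse_designator and the dictionary layout of its result are hypothetical, and it handles only the role letters and the "-s" and "-d" suffixes that have appeared up to this point.

```python
# Role letters and suffixes as defined in the text so far.
ROLES = {"A": "agent", "P": "patient", "F": "focus"}
MODES = {"s": "static", "d": "dynamic"}


def parse_designator(designator):
    """Split a designator such as 'AP/F-d' into subject roles, object roles, and mode.

    Hypothetical helper: the slash separates subject from object, and the
    letter after the hyphen says whether the verb is static or dynamic.
    """
    slots, mode = designator.rsplit("-", 1)
    if mode not in MODES:
        raise ValueError(f"unknown suffix: -{mode}")
    parts = slots.split("/")
    if len(parts) > 2:
        raise ValueError("only subject and object slots are handled in this sketch")
    for part in parts:
        for letter in part:
            if letter not in ROLES:
                raise ValueError(f"unknown role letter: {letter}")
    subject = [ROLES[letter] for letter in parts[0]]
    obj = [ROLES[letter] for letter in parts[1]] if len(parts) == 2 else []
    return {"subject": subject, "object": obj, "mode": MODES[mode]}


# "John glanced at the mouse." is AP/F-d: the subject is both agent and
# patient, the object is the focus, and the verb is dynamic.
print(parse_designator("AP/F-d"))

# "The baby slept." is P-s: a patient subject, no object, static.
print(parse_designator("P-s"))
```

Reading the result for AP/F-d back against the paraphrase "John caused himself to enter a visually perceptive state focused on the mouse" shows each piece of the designator doing its own job: "A" contributes "caused", "P" contributes "himself to enter a state", "F" contributes "focused on", and "-d" marks the change of state.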
Overall then, verbs in this group can be generalized as follows:

    P/F-s:   Static, subject = patient only (to hear, to love)
             X experienced a steady state focused on Y

    AP/F-s:  Static, subject = agent & patient (to look at, to listen to)
             X maintained himself in a steady state focused on Y

    P/F-d:   Dynamic, subject = patient only (to remember, to learn)
             X underwent a change of state focused on Y

    AP/F-d:  Dynamic, subject = agent & patient (to glance at, to surround)
             X caused himself to undergo a change of state focused on Y

Note that in all of the above paraphrases, the words "focused on" could be replaced by the words "relative to", emphasizing that the focus is the referent of a relationship with the patient.

Now, some verbs involve the exchange of one item for another, usually between two people. Here are some examples:

    John swapped an apple for an orange with Bill.
    John sold Bill a book for $10.
    Bill bought a book from John for $10.
    John loaned Bill his tiller for $10.
    Bill rented a tiller from John for $10.

In each case, two transfers of possession take place. John loses possession of one item while gaining possession of another, and the reverse change of possession occurs for Bill. Thus, we have, in effect, two patients and two foci, where the foci are the items being exchanged. We can also regard these verbs as composites; i.e. useful abbreviated versions of two distinct verbs, as in "John gave me his apple and I gave him my orange".

Since both patients are equally responsible for the exchange, each one functions as both agent and patient. However, the subject in the above exchanges plays a more important or 'primary' role as agent than the other patient, and the first item plays a more important or 'primary' role as focus. Thus, for example, in the case of "sell", the seller is the primary agent-patient, while the buyer is the 'secondary' agent-patient. The object sold is the primary focus, and the amount paid is the 'secondary' focus.
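The composite reading of exchange verbs can be sketched by storing the primary and secondary agent-patients and foci and then expanding them into the two underlying transfers of possession. Everything named below (the Exchange class, its fields, and the primary_gives flag) is a hypothetical illustration; in particular, the flag is an assumption introduced here to record whether the subject hands over the primary focus (as with "sell") or receives it (as with "buy").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exchange:
    """Hypothetical record of an exchange verb's four participants."""
    primary: str            # primary agent-patient (the subject)
    secondary: str          # secondary agent-patient
    primary_focus: str      # the first item mentioned
    secondary_focus: str    # the item given in return
    primary_gives: bool = True  # assumption: does the subject give up the primary focus?

    def transfers(self):
        """Expand the exchange into two (giver, receiver, item) transfers."""
        if self.primary_gives:
            giver, receiver = self.primary, self.secondary
        else:
            giver, receiver = self.secondary, self.primary
        return [
            (giver, receiver, self.primary_focus),
            (receiver, giver, self.secondary_focus),
        ]


# "John sold Bill a book for $10."
sell = Exchange("John", "Bill", "a book", "$10")

# "Bill bought a book from John for $10."
buy = Exchange("Bill", "John", "a book", "$10", primary_gives=False)

for giver, receiver, item in sell.transfers():
    print(f"{giver} gave {receiver} {item}")

# Both sentences expand to the same two transfers; only the choice of
# primary agent-patient differs.
print(sell.transfers() == buy.transfers())  # True
```

In this sketch "sell" and "buy" differ only in which participant is treated as primary, which matches the observation that the same two transfers of possession take place in each case.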
[This is not the only possible analysis, but I feel that it is the most practical. It also eliminates the need for any special treatment of exchange verbs that do not need a secondary focus, such as "to lend/borrow".]

Finally, there are some cases where the subject is the ONLY agent-patient, as in "John swapped his brown tie for a blue one". Here, John causes himself to undergo a change of relationship with two different items, without the involvement of anyone else. In this example, "a blue one" is the secondary focus.

There are also state verbs which are used to describe the weather and other environmental phenomena. Here are some examples:

    It's raining.
    It stinks in here.
    It's windy.
    It's cold outside.
    It's snowing.
    It's scary in there.
    It's humid today.
    It's dark in there.
    It's getting hot = it's heating up.
    It's getting cloudy = it's clouding up.
    It's quiet when the kids are at school.

In this group of verbs, the subject is the null place holder "it". English verbs always require a subject in the indicative, but this is not true of most languages. Whether or not you require it in your AL is up to you, but it is not semantically meaningful, and I recommend against it.

Note that verbs in this class can be either static or dynamic. Also note that, since these verbs describe states or changes of state, they have an IMPLIED patient which is obvious from the context (i.e. the local environment or current situation). In effect, English uses the pronoun "it" to represent the implied patient.

I will not describe the argument structure of these verbs right now, because we do not yet have a sufficient background to treat them properly. Instead, I will postpone their discussion until after we discuss grammatical voice changes.

2.2 VERB CLASSIFICATION - DYNAMISM AND TELICITY

So far, all of the verbs we have discussed are state verbs. That is, the basic concept represented by such a verb is some kind of state, and this state applies only to the patient.
The states can be focused or unfocused, and they can be brought about or maintained with or without an agent. Also, the states themselves can be categorized by their dynamism; i.e. a state can be "energetic" (e.g. 'alive', 'twinkling', 'sleeping', 'smelly', etc.) or "non-energetic" (e.g. 'dead', 'green', 'tall', etc.). In general, an energetic state can be described using an English present participle, and a non-energetic state can be described using an English adjective or past participle, but there are many exceptions.

Verbs may also be categorized according to their _telicity_. Dynamic verbs that have a built-in endpoint are called _telic_, as in "The violinist played a dirge". Dynamic verbs that do NOT have a built-in endpoint are called _atelic_, as in "The violinist played with the local orchestra".

Unfortunately, distinctions in dynamism and telicity are not very useful, and I know of no natural languages that mark these distinctions. Whether a concept is energetic or not is a basic part of the nature of the concept and has nothing to do with how the concept is applied. In other words, it is an inherent part of the meaning of the verb root, and there is no need to mark it or express it externally. Also, the telicity of a verb often depends on the meaning of its arguments rather than on the form of the verb. Thus, in a derivational system such as I am proposing here, telic distinctions are useless.

[Incidentally, this entire section is 'for your information only'. I felt that it was important to mention dynamism and telicity only because linguists attribute so much importance to these concepts in their theoretical discussions about verbs. In the process, they often weave complex webs of rationalization that end up leading nowhere. I got caught in this web in my earlier studies in lexical semantics, but managed to extricate myself after a considerable waste of time. In my opinion, distinctions in dynamism and telicity are interesting but useless.
And, as I will illustrate below, there is a much more important and useful distinction: the distinction between agent-oriented concepts and patient-oriented concepts.]

2.3 VERB CLASSIFICATION - ACTION VERBS

State verbs are not the only kind of verbs that languages employ. There is one other class of verbs, which I will refer to as _action_ verbs, which differ significantly from state verbs. Let's look at a few examples and then see if we can deduce some useful generalizations:

    Louise told Bill a joke.
    Louise kicked Bill.
    Louise teased Bill.

In each of the above examples, the subject "Louise" is clearly the agent. Also, in the first example, the second object is clearly the focus. But what is the object "Bill"? In each case, Louise is trying to have some kind of effect on Bill, but the final result is not clear. For example, when Louise kicks Bill, we know that something happens to Bill, but the actual outcome depends on many things that are left unstated, such as how hard she kicked, what kind of shoes she was wearing, where she kicked Bill, and so on.

This is quite different from state verbs, where the final state is always clearly indicated by the meaning of the verb. For example, the sentence "He broke the window" makes it very clear what the final state of the window is; i.e. 'broken'. It doesn't tell us anything about the act itself or how it was accomplished. Now, we could say that Bill's final state is 'kicked', but this does not tell us about his condition - it simply tells us how it was accomplished.

The final outcome of the above examples is not clear because these verbs tell us about the act itself rather than the outcome of the act. In other words, these verbs emphasize what the agent is doing rather than emphasizing what is happening to the patient. Thus, state verbs are _patient-oriented_, since they highlight what the patient experiences. Action verbs are _agent-oriented_, since they emphasize what the agent is doing.
If a root concept is patient-oriented, then the verb will indicate what the patient experiences. Patient-oriented verbs may or may not have agents. If the root concept is agent-oriented, then the verb will indicate what the agent is doing. An agent-oriented verb MUST have an agent. All patient-oriented verbs are state verbs. All agent-oriented verbs are action verbs. The most common action verbs are _speech acts_. Here are some examples: He advised his clients. He blessed the crowd. He told me a joke. He mocked them. He answered the teacher. He called me an idiot. He blamed John for the accident. He dared me to try it. In all of the above the first object is the patient, since it is the entity which the agent is trying to affect. For the verbs which have two objects, the second object is the focus. Thus, in the sentence "He told me a joke", "He" is the agent, "me" is the patient, and "a joke" is the focus. [Incidentally, verbs which have two objects are called _ditransitive_.] Now, a verb like "to mock" is different from a verb like "to kick" because "mock" is a speech act while "kick" is a physical act. However, this difference is not important for our purposes. A much more important difference is that "kick" implies success, while "mock" does not imply success. In other words, if you are kicked, then you experience a change of state, even though the verb itself does not tell us what the final state is. If you are mocked, however, the verb does not indicate if the mocking was successful. In fact, most speech acts do not even guarantee that the patient hears the agent. Thus, the speech acts involve an agent ATTEMPTING to affect, influence, or transfer information to a patient using speech. In effect, the patient can POTENTIALLY undergo a change of state, but the outcome is not certain. I will use the designator "-p" ("p" for "potential") for verbs which indicate that the patient may POTENTIALLY experience a change-of-state in the event described by the verb. 
Thus, the complete designations of the above speech acts are as follows:

    A/P-p:    He advised his clients.
    A/P-p:    He blessed the crowd.
    A/P/F-p:  He told me a joke.
    A/P-p:    He mocked them.
    A/P-p:    He answered the teacher.
    A/P/F-p:  He called me an idiot.

Note that I'm using TWO slashes to separate the arguments of ditransitive verbs.

Now, the astute reader will be asking why I chose to define "-p" verbs as I did above; i.e. as verbs which indicate that the patient is POTENTIALLY affected. After all, I spent a lot of time defining the difference between agent-oriented and patient-oriented concepts, but I never mentioned this difference when I defined the new "-p" class of verbs. The reason is that the orientation of a verb is an inherent part of the verb's root concept, and has nothing to do with how the root is used. For example, a verb root representing the agent-oriented concept 'kick' could be used to create the following verbs:

    A/P-d:  to kick
    A/P-p:  to kick at

Here, the "-d" verb indicates that the patient experienced a change of state, even though the actual final state can only be guessed at. The "-p" verb, however, simply indicates that the agent TRIED to affect the patient, and says nothing about whether the agent was successful. Note though, that both the "-d" and "-p" versions are action verbs. The root concept is simply being applied differently.

Similarly, a verb root representing the patient-oriented concept 'scratched' could be used to create the following verbs:

    P-s:    to have a scratch or scratches
    P-d:    to get/become scratched
    A/P-d:  to scratch
    A/P-p:  to scratch at

In fact, some English verbs can be converted from a "-d" state verb to a "-p" verb using the preposition "at" and occasionally "on". Here are some examples:

    He cut the rope.         -> dynamic state verb
    He cut at the rope.      -> potential state verb
    He grabbed the rope.     -> dynamic state verb
    He grabbed at the rope.  -> potential state verb
    He shot the deer.        -> dynamic state verb
    He shot at the deer.     -> potential state verb
    He pulled the rope.      -> dynamic state verb
    He pulled on the rope.   -> potential state verb
    He tugged the rope.      -> potential state verb

Note that the dynamic state verbs imply that a final state is achieved as well as what the final state is. The potential counterparts indicate only that an attempt is made with no guarantee of the final outcome.

In English, the above "at" construction is not always productive. However, the generic verbs "to have at", "to try", or "to work at" can be used if the combination of dynamic state verb plus "at" or "on" is either unacceptable or has an inappropriate meaning, as in the following examples:

    He opened the door.      -> dynamic state verb
    He broke the door.       -> dynamic state verb
    He painted the door.     -> dynamic state verb
    He tore down the door.   -> dynamic state verb
    He had at the door.      -> potential state verb
    He tried the door.       -> potential state verb
    He worked at the door.   -> potential state verb

In summary, it's important to be able to distinguish between "-s", "-d", and "-p" verbs, whether they represent patient-oriented or agent-oriented root concepts. In effect, a particular verb (either a simple verb, such as "kick", or a complex one, such as "kick at") can be categorized as one of six possible types:

    Action verbs:  -s: "to dictate"   -d: "to kick"    -p: "to tell"
    State verbs:   -s: "to stink"     -d: "to break"   -p: "to tug"

Finally, we mentioned earlier that the focus of a verb can be one of the following:

    1. The entity on which the patient's attention or mental state is 'targeted' or 'focused'; e.g. to see, to play, to learn, to love, to tell, etc.

    2. The referent of a relationship with the patient (i.e. the patient's state relative to the focus); e.g. to own, to surround, to include, to need, etc.

    3. An elaboration of the event itself; e.g. to play, to sing, to tell, etc.

We can now state a very important observation regarding the focus of action verbs:

    The focus of ALL action verbs MUST be (3) above.
However, there can still be an overlap. Thus, although an action verb MUST be in category (3), it can also be in another category. For example, because "sing" is an action concept, the focus must elaborate the event, as in "John sang a little ditty". However, it can also fall into category (1), since the object "a little ditty" can be considered the focus of the mental state of the patient. There is another group of action verbs that are typically referred to as _activities_. Here are some examples: The children played (hide and seek). The athletes ran (the marathon). The guests danced (the polka). The old hag smoked (a pipe). The boy read (a good book). These verbs describe situations in which the agent maintains itself in an ongoing, energetic state. As a result, these verbs are all static AP/F-s verbs, and can be paraphrased as "Agent does something to maintain itself in a steady, active state". In effect, since the agent and the patient are the same, and since an action verb tells us what the agent is doing, it also tells us the state of the patient. In other words, the action and the state are essentially the same. [Incidentally, some readers might argue that the objects of the verbs "smoke" and "read" in the above examples do not elaborate the event, as is required of all action verbs. However, I disagree. Just because the objects "a pipe" and "a good book" are noun phrases representing physical objects does not mean that they cannot represent events. In fact, these simple noun phrases actually evoke complete events because of their inherent natures. A book does not serve its main purpose if it is not read and a pipe does not serve its main purpose if it is not smoked.] Now, many activity verbs CAN take an explicit patient that is not also the agent. Here are some examples: John played Bill three games of chess. The athletes ran their sneakers threadbare. His wife danced him into a stupor. She smoked us out of the house (i.e., her smoking caused us to leave). 
The boy read his sister a story. In these examples, we are still saying what the agent is doing while placing more emphasis on what is being done to someone/something else. Thus, these verbs are the A/P versions of the basic activities. [Incidentally, the word "threadbare" in the "run" example, and the expressions "into a stupor" in the "dance" example and "out of the house" in the "smoke" example are called _resultatives_, since they indicate the final or 'result' state of the patient. We'll have more to say about resultatives later.] It's important to emphasize that, when dealing with action concepts, we cannot treat AP derivations as we did with state verbs. In an AP state derivation, the agent is causing itself to experience the state that normally applies only to the patient. In an AP action derivation, the agent is causing the patient to perform the action that is normally performed only by the agent. In other words, in an AP state derivation, the agent EXPERIENCES the same thing (i.e. state) as the patient. In an AP action derivation, the patient DOES the same thing (i.e. action) as the agent. Thus, an AP-s version of a verb such as "to kick" does NOT mean that the agent kicks himself. Instead, it means that the agent is simply "kicking"; i.e., he is involved in the activity of "kicking" with no specified or discernable target. This is a subtle distinction, but it is an extremely important one. [Incidentally, this distinction could also be handled by designating the above verb as simply A-s rather than AP-s. However, I have chosen to keep the AP notation because of the inherent symmetry of the distinction, and because it emphasizes that the agent is causing itself to experience what is essentially an energetic "state".] 2.4 GENERALIZATIONS ABOUT VERBS Now, let's look at some of the distinctions that exist among these categories, and see if we can make some generalizations about verbs. 
In looking over the above groupings, we can draw the following conclusions:

    1. All verb concepts are either:
       a. Patient-oriented -> the root describes the ongoing or final state of the patient.
       b. Agent-oriented -> the root describes what the agent is doing.
    2. All verbs are either:
       a. Static verbs -> these indicate that the patient experiences a steady state.
       b. Dynamic verbs -> these indicate that the patient experiences a change of state.
       c. Potential verbs -> these indicate that the patient MAY experience a change of state.
    3. The subject of a verb can be any of the following:
       a. Agent
       b. Patient
       c. Both agent and patient
       d. Nothing
    4. The object of a verb can be any of the following:
       a. Patient
       b. Focus
       c. Nothing
    5. Some English verbs take three arguments. In these cases, the subject is the agent, the first object is the patient, and the second object is the focus.
    6. All verbs have a patient, whether stated or implied.

As mentioned earlier, there are a few oddballs which have unusual argument structures, but these are rare and tend to be irregular or idiosyncratic. For the time being, we will limit our discussion to the larger, more regular categories. [Actually, as we will see throughout this monograph, the so-called 'oddballs' can ALWAYS be derived from more regular verbs via some form of grammatical voice change or re-classification.]

From the above list, we might be tempted to create a matrix of 2x3x4x3x2 = 144 elements. However, most combinations never appear. Note, for example, that the orientation of the verb is an inherent part of the meaning of the root, and we will never find two verbs that differ ONLY in this characteristic. Also, a patient can be the subject OR the object - not both - which, of course, makes sense. And if the first argument is both agent and patient, then the second argument cannot be a patient. Also, it serves no useful purpose to have a verb with an object but with no subject. And so on.
With all of the above in mind, we can construct a chart of the possible forms that verbs can take:

    ARGUMENTS    STATIC          DYNAMIC          POTENTIAL
    --------------------------------------------------------------------
    A/P/F        to conduct      to teach         to tell
    A/P          to manage       to cure          to shoot at
    AP/F         to ignore       to memorize      to study
    AP           to behave       to escape
    P/F          to see          to recall
    P            to stink        to recuperate
    none         to be cloudy    to cloud up

Note that I left most of the "potential" category empty. We WILL be creating many of these verbs, but to list them now would only cause confusion.

Note also that I have excluded verbs that take instrumental subjects (e.g. "The hammer broke the window"). English is one of the very few languages that allows constructions like this. And those few that do allow this generally mark the verb to indicate that the subject is instrumental (e.g. Malagasy, many Bantu languages, many Philippine languages, etc.). I'll have more to say about this later.

I have also excluded verbs that can take a _beneficiary_ as an indirect object, as in "He baked the kids a cake". A beneficiary is someone who may only be indirectly affected by the event, unlike the patient which is directly affected. I'll discuss the beneficiary case role later.

2.5 VERB DESIGN ALGORITHM

So, how do we apply these generalizations to the practical problem of verb design? Answer: we do it by CLASSIFYING and MARKING our verbs (in some way or other) to indicate their valency, case requirements, and whether or not they reflect a steady state, change of state, or potential change of state. The easiest way to do this, in my opinion, is to design the morphology of the language to reflect these differences.
For example, the following English verbs will all be derived from the same root but will have different markers to indicate their different argument structures:

    AP-d      to escape = Agent causes self to become free
    AP/F-d    to escape from = Agent causes self to become free relative to focus
    A/P-d     to release, to free, to liberate = Agent causes patient to become free
    A/P/F-d   to release from, to free from = Agent causes patient to become free relative to focus
    P-d       to get loose, to become free = Patient becomes free
    P/F-d     to get loose from, to become free of = Patient becomes free relative to focus
    AP-s      to stay free, to remain free = Agent keeps self free
    P-s       to be free = Patient is free

And so on. For all of the above, we can use a state root with the meaning 'free' or 'unrestrained', and can apply a different marker to indicate whether the result is AP-s, A/P-d, etc.

Now, let's illustrate this approach in greater detail by precisely defining the morphology that we will use, and providing lots of examples of how to use it. Here is a formal description of the sample morphology that we will use:

    Word           ::= Root + ( Classifier ) + Part-of-speech
    Root           ::= Morpheme
    Classifier     ::= Morpheme
    Morpheme       ::= C D | C V { X }
    Part-of-speech ::= Terminator
    Terminator     ::= da | no | pe | si

    C  = any consonant (p, b, t, d, k, g, c, j, l, m, n, f, v, s, z, x)
    D  = any diphthong (ai, au, eu, ia, ie, io, iu, oi, ua, ue, ui, uo)
    V  = any vowel (a, e, i, o, u)
    S  = any semi-vowel (w, y)
    X  = extension = S V | C C V
    |  = logical 'or'
    () = enclosed item is optional
    {} = enclosed item may appear zero or more times
    Lower case letters represent themselves

If you have difficulty understanding the above formal description, I suggest that you read my separate essay entitled "Morphology". It provides a brief and simple tutorial on how to describe the shapes of words and morphemes.

The above is a portion of the complete morphological system that I will be using throughout this monograph.
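As a rough illustration, the formal description above can be transcribed almost symbol for symbol into a regular expression. The following Python sketch covers only the portion of the morphology given so far; the names MORPHEME, WORD, is_morpheme, and is_word are invented for this example and are not part of the language design.

```python
import re

# Character classes taken directly from the formal description.
C = "[pbtdkgcjlmnfvszx]"                        # any consonant
V = "[aeiou]"                                   # any vowel
S = "[wy]"                                      # any semi-vowel
D = "(?:ai|au|eu|ia|ie|io|iu|oi|ua|ue|ui|uo)"   # any diphthong
X = f"(?:{S}{V}|{C}{C}{V})"                     # extension = S V | C C V

# Morpheme ::= C D | C V { X }
MORPHEME = f"{C}(?:{D}|{V}{X}*)"

# Word ::= Root + ( Classifier ) + Part-of-speech
WORD = f"{MORPHEME}(?:{MORPHEME})?(?:da|no|pe|si)"


def is_morpheme(text: str) -> bool:
    """True if the whole string has the shape of a single morpheme."""
    return re.fullmatch(MORPHEME, text) is not None


def is_word(text: str) -> bool:
    """True if the whole string is root + optional classifier + terminator."""
    return re.fullmatch(WORD, text) is not None
```

For instance, is_morpheme("kue") and is_morpheme("taske") both hold, while is_morpheme("nda") does not, because a morpheme cannot begin with two consonants.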
Additional features will be introduced as needed. Appendix A contains a complete description, and Appendix B contains a list of all of the morphemes that will be created and used in the monograph.

Pronounce vowels as in Italian, Swahili, or Japanese (i.e. the five cardinal vowels /a/, /e/, /i/, /o/, and /u/). Pronounce consonants as in English, except for the following: "c" is like "ch" in "church", and "x" is like "sh" in "she". In two-vowel clusters, you may optionally pronounce "i" like the semi-vowel "y", and "u" like the semi-vowel "w". The diphthong "ui" should be pronounced /wi/ and "iu" should be pronounced /yu/.

Main stress should be applied to the next-to-last syllable. Secondary stress should be applied to the second syllable if there is at least one unstressed syllable between it and the syllable that receives the main stress.

[Incidentally, a diphthong is simply a VSV in which the semi-vowel has been deleted; the semi-vowel 'w' is deleted if either V is 'u', and the semi-vowel 'y' is deleted if either V is 'i'. Thus, for example, "-io-" is actually "-iyo-", "-eu-" is actually "-ewu-", "-ui-" is actually "-uwi-", "-iu-" is actually "-iyu-", and so on.]

For verbs, I will use the terminator "-si" to indicate the part-of-speech. Thus, examples of well-formed verbs are:

    baposi      xendisi     moyakuesi      taskesi     guyondetayusi
    ba-po-si    xendi-si    moya-kue-si    taske-si    guyonde-tayu-si

where hyphens show the boundaries between roots, optional classifiers, and terminators. For example, the word "baposi" consists of the root "ba", the classifier "po", and the terminator "si". The word "xendisi" consists of only the root "xendi" and the terminator "si" - it does not have a classifier.

Incidentally, note that a word with the above morphology can always be parsed unambiguously into its component morphemes, and that a stream of words can always be divided unambiguously into individual words even if there are no spaces between words.
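To make the parsing claim concrete, here is a minimal Python sketch of a left-to-right segmenter for an unspaced stream. It is only an illustration under stated assumptions: it handles just the portion of the morphology described so far, it assumes that no root or classifier is identical to a terminator, and the names MORPHEME, TERMINATORS, and segment are invented for this example.

```python
import re

# Morpheme ::= C D | C V { X }, with X = S V | C C V (from the formal description).
C, V, S = "[pbtdkgcjlmnfvszx]", "[aeiou]", "[wy]"
D = "(?:ai|au|eu|ia|ie|io|iu|oi|ua|ue|ui|uo)"
MORPHEME = re.compile(f"{C}(?:{D}|{V}(?:{S}{V}|{C}{C}{V})*)")
TERMINATORS = {"da", "no", "pe", "si"}


def segment(stream: str) -> list[list[str]]:
    """Split an unspaced stream into words, each a list of morphemes."""
    words, current, pos = [], [], 0
    while pos < len(stream):
        match = MORPHEME.match(stream, pos)
        if match is None:
            raise ValueError(f"no morpheme starts at position {pos}")
        current.append(match.group())
        pos = match.end()
        if match.group() in TERMINATORS:
            # Word ::= Root + ( Classifier ) + Part-of-speech
            if not 2 <= len(current) <= 3:
                raise ValueError(f"ill-formed word: {'-'.join(current)}")
            words.append(current)
            current = []
    if current:
        raise ValueError("stream ends without a terminator")
    return words
```

Calling segment("baposixendisi") yields the two words ba-po-si and xendi-si. No backtracking is needed, because an extension always begins with a semi-vowel or a consonant pair, whereas a new morpheme always begins with a single consonant followed by a vowel.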
Thus, the boundaries between morphemes and words are never in doubt. This feature of word morphology is usually referred to as either _self-segregation_ or _auto-isolation_.

Because of the self-segregation rules, a morpheme cannot have the form of a terminator. For example, a root with the form "-da-" is illegal because "da" is a terminator. However, "-daye-", "-daspo-", and "-sunda-" ARE legal because the extensions "-ye-", "-spo-", and "-nda-" create unique morphemes.

For verb classifiers, let's use the following morphemes:

    A/P/F-s: -tue-      A/P/F-d: -ko-      A/P/F-p: -nio-
    A/P-s:   -zoya-     A/P-d:   -pu-      A/P-p:   -ce-
    AP/F-s:  -fi-       AP/F-d:  -sua-     AP/F-p:  -mi-
    AP-s:    -panji-    AP-d:    -za-      AP-p:    -diu-
    P/F-s:   -ma-       P/F-d:   -do-      P/F-p:   -gui-
    P-s:     -se-       P-d:     -pia-     P-p:     -moncu-

Now, before proceeding, let's briefly review the semantics behind the notation we are using. All verbs have a patient, whether stated or implied. If a verb has an agent, then the agent is responsible for the event described by the verb. If a verb has a focus, then the focus is the referent of a relationship (or potential relationship) with the patient. This referent can be either another entity, as in "John needs a pencil", or it can elaborate the event itself, as in "John told a joke".

A verb is either an agent-oriented action verb or a patient-oriented state verb. An action verb emphasizes what the agent is doing rather than what the patient is experiencing. A state verb emphasizes the ongoing or final state of the patient rather than how it came about or how the agent, if any, brought it about. An action verb MUST have an agent. A state verb may or may not have an agent. If an action verb has a focus, then the focus MUST elaborate the event.

Finally, it is extremely important to note that many states are so strongly scalar that it makes no sense to think of them in absolute terms.
For example, if I say that something is 'long', I mean that it is long relative to some referent understood by both the speaker and the listener. When deriving dynamic verbs, we will always use state concepts such as these in a relative sense, rather than in an absolute sense. Thus, an A/P-d verb formed from the state concept 'long' will mean 'to lengthen' - it will NOT mean 'to make long'. This convention will be followed rigorously in the derivation of ALL state verbs that indicate a change in an inherently scalar state. 2.5.1 VERB DESIGN EXAMPLES For these examples, I'm going to start with an English verb, analyze it to determine its argument structure, and create a word for it in our sample language. I will then try to create as many other verbs as possible from the SAME root by using different classifiers. Let's start with the verb "to know", in the sense of 'having knowledge'. Typical sentences using this verb could be: He knows the answer. or He knows calculus. Here, the subject is the patient and the object is the focus. The subject experiences a steady state of 'knowledgeable' focused on the object. Thus, this verb is a patient-oriented state verb and is classified as P/F-s. Now, in our sample language, I will assign it the root morpheme "-teyo-". Thus, the root "-teyo-" will represent the state concept that means 'knowing' or 'knowledgeable'. Next, if we add the classifier "-ma-" to indicate that the argument structure is P/F-s, and add the terminator "-si" to indicate that the word is a verb, then the resulting word "teyomasi" is the verb meaning 'to know'. We will also adopt the convention that each root will have a default class if none is explicitly provided, and that the default class will depend on the meaning of the root. Since 'knowing' is inherently relational, the default class for "-teyo-" will be P/F-s. Thus, the words "teyosi" and "teyomasi" are synonymous. 
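The word-building steps just described (root, then optional classifier, then the verb terminator) are mechanical enough to sketch in a few lines of Python. The classifier table below is copied from the list of verb classifiers given earlier; the function and variable names are invented for this illustration, and the default-class table contains only the one root assigned so far.

```python
# Classifier morphemes, copied from the list of verb classifiers.
CLASSIFIERS = {
    "A/P/F-s": "tue",  "A/P/F-d": "ko",  "A/P/F-p": "nio",
    "A/P-s": "zoya",   "A/P-d": "pu",    "A/P-p": "ce",
    "AP/F-s": "fi",    "AP/F-d": "sua",  "AP/F-p": "mi",
    "AP-s": "panji",   "AP-d": "za",     "AP-p": "diu",
    "P/F-s": "ma",     "P/F-d": "do",    "P/F-p": "gui",
    "P-s": "se",       "P-d": "pia",     "P-p": "moncu",
}

# Default class of each root; only "teyo" has been assigned one so far.
DEFAULT_CLASS = {"teyo": "P/F-s"}

VERB_TERMINATOR = "si"


def make_verb(root, structure=None):
    """Build a verb. With no structure given, the classifier is omitted
    and the root's default class is understood."""
    if structure is None:
        return root + VERB_TERMINATOR
    return root + CLASSIFIERS[structure] + VERB_TERMINATOR


def default_forms(root):
    """Return the short form and its explicit synonym for a root."""
    return make_verb(root), make_verb(root, DEFAULT_CLASS[root])
```

Here make_verb("teyo", "P/F-s") gives "teyomasi", make_verb("teyo") gives the synonymous short form "teyosi", and any other entry of the table can be applied to the same root in the same way.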
[Later, we will discuss how to shorten many words even further when we make the distinction between 'high' and 'low' versions of the language. The high language will be used in formal occasions and when talking to computers (because computers require a self-segregating morphology). The low language can be used when talking informally to people. In the low language, the word "teyomasi" can be further shortened to simply "teyo".]

Next, let's take the same root and see what happens when we apply different classifiers to it. Note that the English glosses are approximations - I'll have more to say about this later. We will deal first with focused verbs, since the concept of 'knowing' is inherently focused:

    A/P/F-s: "teyotuesi" = 'to keep (someone else) current in (something)'
             Agent maintains patient's knowledge.
             e.g. He keeps them up-to-date in company procedures.

    A/P/F-d: "teyokosi" = 'to teach (someone) (something)'
             Agent causes patient to gain knowledge.
             e.g. He taught them French.

    A/P/F-p: "teyoniosi" = 'to instruct (someone) in (something)'
             Agent attempts to cause the patient to gain knowledge.
             e.g. He instructed them in table manners.

    AP/F-s:  "teyofisi" = 'to review', 'to keep oneself current in (something)'
             Patient maintains his knowledge.
             e.g. He reviewed the day's lessons every evening.
             [Note that this verb implies that the patient is successful in maintaining his knowledge. Thus, the English word "review" is not a perfect match.]

    AP/F-d:  "teyosuasi" = 'to self-teach (something)', 'to determine/learn'
             Patient causes himself to gain knowledge.
             e.g. He taught himself French OR He learned French on his own.
                  He determined the meaning with the help of a dictionary.

    AP/F-p:  "teyomisi" = 'to study', 'to attempt to teach oneself (something)', 'to try to determine/learn'
             Agent attempts to cause himself to gain knowledge.
             e.g. He studied the new lesson.

    P/F-s:   "teyosi" = "teyomasi" = 'to know (something)', 'to understand (something)'
             Patient is knowledgeable.
             e.g. He knows the rules of the game.

    P/F-d:   "teyodosi" = 'to learn (something)', 'to come to know'
             Patient gains knowledge.
             e.g. He learned the rules by watching the others.

    P/F-p:   "teyoguisi" = 'to be exposed to knowledge of (something)'
             Patient has an experience that may potentially increase his knowledge.
             e.g. He was exposed to French for three years, but never learned a word of it.

Keep in mind that the above English glosses are approximations, and that the real meaning should be determined from the root plus its argument structure. For example, "teach" implies that the teaching was successful because a dynamic state verb indicates that a change-of-state did, in fact, occur, while "instruct" does not imply success, but only that there was potential for success. With the precisely defined semantics used above, there is no doubt.

Also, keep in mind that the paraphrases cannot capture the IMMEDIACY of the involvement of the participants. This immediacy can only be represented by the single word - NOT by the paraphrase. For example, a paraphrase of the verb "to kill" is 'to cause to die', even though the two are not synonymous. The paraphrase is simply the closest we can get to the true meaning using multiple words. Please keep this in mind, since we will be using paraphrases throughout this monograph.

Note that all of the above derivations are focused. Focused derivations are the most useful simply because the concept 'knowing' is most often applied this way. But the unfocused derivations are also very useful, as we'll see later, when we discuss the difference between focused and unfocused concepts. Before we can discuss these differences, though, we need to acquire a little more background in the semantics of verbal concepts.

Now, let's wrap up this section.
In the above examples, we managed to derive the following useful, non-synonymic English verb lexemes from a single root morpheme:

    teach, study, review, learn, determine, know, and instruct

We also created the basis for deriving the following non-verb lexemes:

    knowledge, education, pedagogy, erudition, determination,

and several others that can be further derived, such as knowledgeable, pedagogical, heuristic, etc. We'll see how to do these later. In all, though, we will be able to derive a large number of non-synonymic English lexemes from a single root morpheme. Not bad, huh? Also, we were able to deal with concepts such as 'to keep/stay up-to-date', 'to self-teach', etc. for which English must use periphrasis, metaphors, or even idiom.

2.5.2 MORE ON DESIGN PHILOSOPHY

Well, I hope you're still with me. And if you think I've gone off the deep end, then I suggest that you stop reading now, because I plan to carry this basic approach even farther.

In self-defense, however, I might remind you that ALL of the constructions above have functional counterparts in natural languages. For example, Japanese, Swahili, Hungarian, Korean, Turkish, and many, many other languages use an inflectional morpheme to indicate causation ('kill' vs. 'die'). Romance languages use different auxiliaries and/or reflexive pronouns to indicate transitivity. Some Semitic languages (e.g. Arabic), Quechuan languages (e.g. Quechua), Malayo-Polynesian languages (e.g. Indonesian), and Bantu languages (e.g. Swahili) employ a rich set of inflectional or derivational morphemes to indicate transitivity, reflexivity, passivity, causation, reciprocity, etc. Iroquoian languages such as Cherokee and Mohawk explicitly mark their verbs to indicate transitivity, as well as whether the arguments of a verb are agent, patient, or both.
[In fact, there are many languages from completely unrelated language families, such as Lakhota (Siouan), Acehnese (Sumatran), and Batsbi (Caucasian) that mark their verbs to indicate which argument of the verb is the semantic agent and which is the semantic patient in ways that are almost identical to what we are doing here.] Other languages, such as Swahili, Mokilese, Ainu, Fijian, Tagalog, and Somali, mark their verbs inflectionally and/or derivationally to change their argument structure (i.e. to add or change a thematic role required by the verb). Quite a large number of languages (e.g. Caucasian languages of Central Asia, Salishan languages of North America, and others) go so far as to mark a verb when it does not take an argument that it normally takes (e.g. "He is smoking" vs. "He is smoking a cigar"). Russian has two derived forms for most of its verbs which, when applied to many speech acts, captures the distinction we are making here between A/P/x-d and A/P/x-p verbs. Mayan and related languages mark their verbs for agency, transitivity, and stativity in a manner that is remarkably similar to what we are doing here. The difference, of course, is that no single natural language uses such a precise, comprehensive and consistent derivational morphology for ALL of its verbs. Keep in mind, though, that we are talking about the design of an ARTIFICIAL language, and the more regularity we build into it, the easier it will be to learn. Another thing to keep in mind is that this approach is derivational, NOT inflectional, and only the word designer has to spend the time needed to master the system. Thus, the language learner does not have to memorize inflections or derivations. Also, even though some of the above constructions may appear useless (a debatable judgement, in my opinion), we still managed to create several useful words from a single root morpheme. And, as we will see below, the actual number of useful creations will be MUCH higher. 
Finally, this rigorous approach to verb design has some interesting consequences that may not be immediately obvious. In using this kind of approach, you will find that many of the words you create have close (but not quite exact) counterparts in your native language. For example, the English verb "study" is the agentive/potential counterpart of the verb "learn", yet the following two sentences are pragmatically identical: 1. I went to MIT to learn math. 2. I went to MIT to study math. By "pragmatic", I mean that a listener hearing one of the above sentences would not come away with a different understanding if he had heard the other. However, there are cases where the two words cannot be interchanged, as in the following examples: 3. As I got older, I gradually learned discretion. 4. *As I got older, I gradually studied discretion. In other words, even though the meanings of the two words can overlap, their prototypical interpretations are distinct. With the rigorous approach to word design discussed here, these distinctions will be enforced to the degree that the first two sentences above (1 and 2) would have distinctly different meanings - number 1 would imply passivity since the subject is NOT an agent, while number 2 would imply industry since the subject IS an agent. This approach has interesting and highly desirable consequences. If you do NOT use a rigorous approach to word design, and attempt to create a vocabulary that duplicates every nuance of every natural language, your vocabulary will be essentially infinite in size (also, the task will be quite impossible). If you create your vocabulary based only on what you know of English or some other natural language, your vocabulary will simply be the original in disguise, with all of its idiosyncrasies. 
If, however, you design your vocabulary using the semantic techniques discussed here (or something equally rigorous), you will not only maximize its inherent neutrality, but you will also make it much easier to learn. The penalty, of course, is that the words you create may not precisely overlap, in meaning and usage, their closest equivalents in your native vocabulary. But this lack of precise overlap is exactly what you ALWAYS experience whenever you study a different language. So, do you want to create a clone of an existing vocabulary with all of its idiosyncrasies? Or do you want to maximize the neutrality and ease-of-learning of the vocabulary of your AL? You can't have it both ways. In fact, it is this aspect of vocabulary design that seems to frustrate so many AL designers, who feel that they must capture all of the subtleties of their native language. In doing so, they merely end up creating a clone of the vocabulary of their natural language. The result is inherently biased, semantically imprecise, and difficult to learn for speakers of other natural languages. It is extremely important to keep in mind that words from different languages that are essentially equivalent in meaning RARELY overlap completely. Fortunately, all of this does NOT mean that your AL will lack subtlety. In fact, with a powerful and predictable derivational morphology, your AL can capture a great deal of subtlety, and can go considerably beyond any natural language. The only difference is that, unlike a natural language, the subtleties will be predictable rather than idiosyncratic. Another major advantage of a system like this is that it will make the vocabulary building process faster and easier.
In effect, we will be able to design our vocabulary by using a 'back door' approach; i.e., we will start with a powerful derivational system (of which we've only seen a tiny part so far), and iteratively decompose words from a natural language and apply all possible derivations to the resulting root morphemes. In doing so, MANY additional useful words will be automatically created, making it unnecessary to decompose a large fraction of the remaining natural language vocabulary.

In the above example, we started with the verb "to know" and derived several useful words from its basic root morpheme. This is only the tip of the iceberg. The rest of this monograph will illustrate how to derive many more additional words from the same root morpheme.

2.5.3 FROM BASIC VERB TO NOUN

The sample morphology I am using here requires that all verbs end with "-si". Let's extend the morphology so that all nouns end with "-da", and so that any verb can be converted to a noun by simply changing the final "-si" to "-da". The question then becomes: how do we interpret the result? Here is the rule that I feel is most productive:

    When converting a basic verb to a noun, the noun will represent a PROTOTYPICAL GENERIC SUBJECT of an event indicated by the verb.

Here's a short list of the most useful derivations (I leave the detailed analyses as an exercise for the reader):

    A/P/F-d: "teyokoda"  = 'teacher'
    A/P/F-p: "teyonioda" = 'instructor'
    AP/F-p:  "teyomida"  = 'student'
    P/F-s:   "teyoda"    = 'knower', 'the one in the know'
    P/F-d:   "teyododa"  = 'learner'

It would also be useful to have passive-like forms such as 'generic-someone-who-is-taught' = "pupil", 'generic-something-which-is-learned (without agency)' = "experience", 'generic-something-which-is-taught' = "subject/course/curriculum", 'generic-something-which-is-known' = "knowledge", etc. To obtain these, we can apply a class-changing morpheme to the verb to make it passive, and then change the final "si" to "da".
I'll have more to say about how to do this later.

An instrumental derivation could be used to create a 'generic-instrument-used-to-teach' = "teaching materials", 'generic-instrument-used-to-maintain-knowledge' = "review materials", 'generic-instrument-used-to-teach-oneself' = "self-study materials", etc. Later, I'll have much more to say about these additional derivations.

2.5.4 FROM BASIC VERB TO ADJECTIVE

Again, let's extend the morphology so that all adjectives end with "-no", and so that any verb can be converted to an adjective by simply changing the final "-si" to "-no". The question again becomes: how do we interpret the result? Here is the rule that I feel is most productive:

    When converting a basic verb to an adjective, the adjective will represent the prototypical QUALITIES of a generic subject, expressed attributively.

This meaning can be best paraphrased as "having the attributes of one who VERBs or of something which VERBs". Here's a short list of the most useful derivations:

    A/P/F-d: "teyokono" = 'having the attributes of one who teaches' = 'teaching'
             (literally: 'having the attributes of one who causes others to increase in knowledge')
             (example: "He was a teaching locksmith")

    AP/F-p:  "teyomino" = 'having the attributes of one who studies' = 'student/studying'
             (literally: 'having the attributes of one who attempts to cause self to increase in knowledge')
             (example: "He was a student geologist")

    AP/F-d:  "teyosuano" = 'self-taught'

    P/F-s:   "teyono" = 'knowing', 'in the know'

It is important to note that the use of present participles (e.g. "teaching") and past participles (e.g. "self-taught") to represent the actual meanings is misleading, because English participles have strong implications of tense and aspect. For non-participial renderings, such as "a student geologist", this is not a problem, whereas "a studying geologist" carries an implication of present tense and imperfective aspect that should not actually exist in the adjective.
Also, for similar reasons, do not confuse adjectives with relative clauses. For example, a "student geologist" is not quite the same as a "geologist who studies/studied/etc" since the relative clause DEFINITELY specifies tense and aspect.

As with nouns, it will be useful to perform class derivations to create even more words. For example, an instrumental derivation could be used to create 'quality-of-a-generic-instrument-used-to-teach' = "pedagogical", 'quality-of-a-generic-instrument-used-to-study' = "heuristic", etc. We'll discuss these additional derivations later.

2.5.5 FROM BASIC VERB TO ADVERB

To continue along the same lines as above, we will adopt the rule that all adverbs end with "-pe". However, before we can put this to use, we must first digress for a while and discuss the semantics of case tags and adverbs.

2.5.5.1 FROM BASIC VERB TO VERBAL MODIFIERS

In this section, I would like to discuss the semantics of adverbs (especially those that correspond to English adverbs that end in "-ly") and most case tags (such as English prepositions, Japanese postpositions, Hungarian case inflections, etc.), and I will try to show how verbs can be converted to adverbs and case tags. The final result will be a system that can replace many complex, idiosyncratic and periphrastic constructions of natural languages with constructions that are syntactically simple and semantically transparent.

2.5.5.2 SEMANTICS OF CASE TAGS AND RELATED ADVERBS

First, let me illustrate how verbs can, in fact, represent the semantics of English prepositions, adverbs, and particles by giving examples from other languages. In these languages, some verbs are actually used in the same way as English prepositions, adverbs, and particles. Consider the following from Vietnamese:

    (1) Toi  di  lai  nha bang.
        I    go  to   bank
        I'm going to the bank.

    (2) Nha bang  o   Hanoi...
        bank      in  Hanoi
        The bank in Hanoi...

In the first example, the word "lai" is actually the verb 'to come'.
When used transitively, it takes a destination as a direct object (like the English verb 'to enter'). In the second example, the word "o" is actually the verb 'to be located at' and takes a location as a direct object. (Thus, the second example could also stand alone as a complete sentence meaning 'The bank IS IN Hanoi'.)

Many other languages, such as Igbo, Ewe, Twi, and Yoruba (Niger-Congo languages of West Africa), Indonesian, Chinese, Cambodian, and many pidgins and creoles have similar constructions. Also, these constructions are not limited to locatives. In Chinese, for example, the word "yung" is the verb meaning 'to use'. It is also the preposition meaning instrumental 'with', as in the sentence "He broke the window WITH a hammer".

It's also possible to create adverbs, particles, and completely new verbs in this manner. In Hindi, for example, "to run go" means 'to run away', and "to cook take" means 'to cook for oneself'. In Yoruba, "to carry come" means 'to bring', and "to carry go" means 'to take away'.

Linguists have a name for this type of construction, in which two or more verbs are linked without the use of coordinating conjunctions or subordinators. Such constructions are called _serial verbs_. There are two major types of serial verb constructions: the events indicated by the verbs are either simultaneous or consecutive. In this discussion, we are only interested in the first category, where the two verbs represent events that occur simultaneously.

Other useful serial verb constructions are those in which two or more verbs are linked, all taking the same subject and object. In these cases, the lack of a conjunction or subordinator often implies a certain 'immediacy'; i.e., that the event is a single entity, rather than a combination of unrelated or sequential events.
Some languages, such as Chinese and Yoruba, allow any combinations that make semantic sense, and even allow noun phrases to split the verbs, creating an effect similar to relative clauses, but where the events indicated by the verbs are often much more tightly linked. Note that these types of constructions are not idiomatic - they are actually quite productive and their meanings are predictable from syntax and context. What most serial verb constructions have in common is that they are taken by speakers as representing parts of the same event. English has a few verbs that can be used in this way, such as "to go look", "to come see", "to let go", "to stir-fry", and "to test-fly" but note that the first two represent consecutive events, which is not what we are interested in here. Most of the time, English uses participles to achieve a simultaneous effect. Here are some examples, where the first sentence of each triplet indicates simultaneity: The child ran screaming to his mother. vs. The child who ran to his mother was screaming. vs. The child who was screaming ran to his mother. The man woke up shivering. vs. The man who woke up was shivering. vs. The man who was shivering woke up. The boy stumbled, knocking over several chairs. vs. The boy who stumbled knocked over several chairs. vs. The boy who knocked over several chairs stumbled. The girl slept, dreaming of unicorns. vs. The girl who slept dreamt of unicorns. vs. The girl who dreamt of unicorns slept. What seems to be happening here is that the participial phrase is more closely linked to the verb rather than to the noun it ostensibly modifies. 
As a result, we can create what are essentially compound verbs without subjects, and the results make perfectly good sense:

    to run screaming
    to wake shivering
    to stumble knocking over several chairs
    to sleep dreaming of unicorns

In effect, the words "screaming" and "shivering" behave exactly like adverbs, and the words "knocking over" and "dreaming of" behave exactly like case tags (i.e. English prepositions) that introduce phrases that modify the verb. Thus, we should be able to create adverbs and case tags from verbs by applying the same semantic logic. Here are some examples:

    I broke the window using a hammer.
    I broke the window with a hammer.
        to use: A/P-s

    The kids ran, crossing the road.
    The kids ran across the road.
        to cross: AP/F-d

    They came, tagging along (i.e. accompanying an unspecified focus).
    They came along.
        to tag along: AP-s

    The army positioned itself, surrounding/encircling the town.
    The army positioned itself around the town.
        to surround: AP/F-d or P/F-s or AP/F-s

    The car moved slowly, backing up.
    The car moved slowly backwards.
        to back up: P-d

    He visited his parents, staying three days.
    He visited his parents for three days.
        to stay: P-s or AP-s

Additionally, if English had a verb like Vietnamese "o", Chinese "zai", Cambodian "niw", or Hausa "yana" (all of which mean 'to be located at or in'), we could create the locative senses of the prepositions "in" and "at" from it. For example, if the English word "bain" meant 'to be located in/at', we would have:

    The children were playing, baining the backyard.
    The children were playing in the backyard.
        to bain: P/F-s

This really should not be all that strange to English speakers, since this is essentially how several English prepositions came about; i.e. a combination of the verb 'to be' plus a locative morpheme. Some examples are "before", "behind", "between", "beneath", etc. Some adverbs were also formed in this way, such as "below" and "beyond".
In summary, speakers of languages with serial verb constructions effectively make up new 'prepositions' as they are needed. If a preposition with a desired literal meaning is not available, English speakers will either use existing prepositions metaphorically, or will use participial constructions as illustrated above. In this monograph, we will implement a system that has the flexibility of the serial verb constructions (but which is semantically and morphologically precise), and thus avoid the need for potentially untranslatable metaphor. 2.5.5.3 DESIGNING CASE TAGS AND RELATED ADVERBS As an example of the adverb/case tag creation process, let's continue where we left off when we started this digression, and create a set of adverbs and case tags from the state concept of 'knowledgeable'. As mentioned earlier, we will do this by changing the final "-si" of the verb to "-pe". Those whose verb forms do NOT take direct objects will become adverbs, and those which DO take direct objects will become case tags (i.e. English prepositions) adding a new oblique argument to the main verb. Thus, in effect, the case tag will LINK its argument to the verb. In the following examples, I will use English for all words except the new case tag/adverb. I will also use English word order. Here are the results: A/P/F-s: "teyotuepe" = 'keeping (someone else) current in' e.g. The company spends a lot of money teyotuepe its employees the latest technology. [Note that "teyotuepe" has two objects. Thus, there is no need for the preposition "in".] A/P/F-d: "teyokope" = 'teaching (someone) (something)' e.g. The man stood in front of the class teyokope the boys geometry. A/P/F-p: "teyoniope" = 'instructing (someone) (in/on something)' e.g. He spoke softly teyoniope the audience the new company policies. AP/F-s: "teyofipe" = 'reviewing', 'keeping oneself current in' e.g. They spent the night at John's house teyofipe the lessons for the next day's exam. 
AP/F-d: "teyosuape" = 'teaching oneself (something)' e.g. He stayed up late every night teyosuape French. AP/F-p: "teyomipe" = 'studying (something)' e.g. He stayed at that university for three years teyomipe physics. P/F-s: "teyope = teyomape" = 'knowing (something)' e.g. Joe quietly left the room teyope he would be called on next. P/F-d: "teyodope" = 'learning (something)', 'coming to know' e.g. He watched their activity for three hours teyodope valuable information. P/F-p: "teyoguipe" = 'being exposed to knowledge of (something)' e.g. He lived in a jungle teyoguipe how to survive. In all cases, note how the derived case tag or adverb modifies the whole sentence, just as if it were an oblique argument of the main verb. Note also that the case tag or adverb is usually tightly bound to one of the core arguments of the main verb, and that this core argument is predictable from the argument structure of the case tag. For example, in the sentence: Joe quietly left the room teyope (= 'knowing') he would be called on next. the subject of the case tag "teyope" is P and links to the patient of the main verb "to leave" which itself is AP/F-d. Thus, the effective subject of the case tag "teyope" is "Joe". And in the sentence: The man stood in front of the class teyokope (= 'teaching') the boys geometry. the subject of the case tag "teyokope" is A and links to the agent of the main verb "to stand" which is AP-s. Thus, the effective subject of the case tag "teyokope" is "the man". If the subject of a case tag is AP, it will link to the agent of the main verb; i.e., the effective subject of an AP case tag or adverb will always be the agent of the main verb. This is simply because agents are inherently more salient than patients or foci. If the main verb does not have an agent, then it will link to the patient. Finally, it is important to note that linkage to an argument of the main verb is common but not universal. 
Since it is possible for the argument of a verb to be an event (as opposed to an entity), it is also possible that a case tag can provide a linkage between its argument and the 'event' represented by the main verb. In other words, the case tag could link to the main verb PLUS its core arguments rather than to just one of its arguments. We'll have more to say about this later. In this section, we discussed how to convert existing verbs into adverbs and case tags. Later, we will discuss how to systematically create the many case tags required by a language, such as those needed to represent English prepositions. 2.5.6 DESIGNING ACTION VERBS All of the previous examples involved patient-oriented root concepts; i.e. the derivations were applied only to STATE verbs. In doing so, we always described the semantics of the derivations in terms of the ongoing or final state of the patient. The semantics of ACTION verbs, however, are quite different, since agent-oriented root concepts place emphasis on what the agent is doing rather than on what the patient is experiencing. Thus, a paraphrase of the semantics of a state verb will be different from a paraphrase of the semantics of an action verb. The following chart lists and clarifies these differences: State verbs: -s -> The patient experiences a steady state -d -> The patient experiences a change of state -p -> The patient potentially experiences a change of state If a state verb has an agent, then the agent is directly responsible for the event. Action verbs: -s -> The agent does something that maintains the patient in an unspecified steady state. (The agent is in control of the patient.) -d -> The agent does something that changes the state of the patient, but the final state is not explicitly indicated. -p -> The agent does something that potentially changes the state of the patient. We've already spent a lot of time on the analysis of state verbs. 
Let's briefly look at a few action verbs: A/P-d: "to kick" Agent 'kicks', having an unspecified effect on the patient. A/P-p: "to kick at" Agent 'kicks', attempting to affect the patient. A/P/F-p: "to tell" Agent 'speaks' something (= the focus), potentially having an unspecified effect on the patient. A/P/F-p: "to sing" Agent 'sings' something (= the focus), attempting to affect the patient. A/P/F-s: "to sing" Agent 'sings' something (= the focus), and DEFINITELY controls or maintains the patient in an unspecified steady state. A/P/F-d: "to sing" Agent 'sings' something (= the focus), and DEFINITELY caused the patient to experience an unspecified change of state. Of the last two examples, the "-s" form would be used to emphasize the control the agent had over the audience while singing, as if the audience were somehow mesmerized. The "-d" form would emphasize the change that the singer causes the listener to experience. For example, if the "-d" form were used in the sentence "She sang the child a lullaby", it would imply that the child did, in fact, fall asleep. In other words, the "-s" form implies successful control over the patient, while the "-d" form implies successful change of the patient. Note that these not-so-subtle distinctions can be made easily and with perfect regularity using the semantic system I am proposing here. However, to make these distinctions in a natural language like English requires either explicit elaboration or idiosyncratic periphrasis. At first glance, it would seem that non-agentive versions of action verbs would be useless. However, if we treat a P-x or P/F-x action derivation as the non-agentive equivalent of the AP derivation (as opposed to the non-agentive version of the A/P or A/P/F derivation), then it will have the added meaning that the activity was done by the patient, but was done accidentally, without actually trying, or without intent. 
In other words, the patient performed the activity but was not fully responsible or in control. In general, action concepts are not as productive as state concepts, because only a small number of argument structures can be usefully applied to an action concept. For example, an action verb must always have an agent. Thus, all of the agentless argument structures, such as P-s or P/F-d, are not going to be very useful (except as discussed above). Also, most of the physical actions, such as 'kick', indicate a very rapid change of state. Thus, the "-s" forms will not be very useful. [Actually, even 'kick' can have a useful "-s" derivation, as in "He kicked the ball all the way home". Here, "all the way" does not need to be explicitly stated because the "-s" form of the verb implies that he controlled the ball over a period of time by kicking it.] Fortunately, this will not have a serious effect on the overall productivity of the semantic system being presented here, because action verbs are quite rare compared to state verbs. Also, we will find that many verbs that appear to be actions at first glance, can actually be derived as "-p" state verbs. The net result is that we will be able to derive most verbs of a language from a surprisingly small number of roots. In spite of the above, I have no intention of abandoning action verbs, and, throughout the remainder of this monograph, I will be deriving many action verbs and doing further derivations on these verbs. (In fact, we will derive some very useful action verbs in the very next section.) Finally, there will be times when the orientation of a root concept is not clear. When this happens, do a complete derivation assuming BOTH orientations, and use the one that is most productive. For example, is 'work' an action or a state verb? I first assumed that it was inherently agent-oriented. However, very few useful derivations resulted, which I thought was strange for such a universal concept. 
However, by treating it as the patient-oriented state 'working/having a job/employed', I got several useful results: A/P-s: to employ The company employs 22 engineers. A/P-d: to hire The company hired 22 engineers. A/P-p: to recruit for We're recruiting for 22 engineers. P-s: to work/have a job He has a job at the butcher shop. P-d: to get a job He got a job last Friday. P/F-s: to work as He works here as a foreman. P/F-d: to get a job as He got a job as a foreman. AP-s: to be self-employed He's self-employed. AP/F-s: to be self-employed as He's self-employed as a plumber. [The English preposition "for", as in "he works FOR the company", would be the beneficiary case tag, which we will discuss later.] Finally, it's important to keep in mind that an AP version of an action verb is an _activity_ that emphasizes what the agent is doing without specifying an external patient. Thus, an AP/F-s version of the verb "to sing" would be used in a context such as "What's she doing? She's singing a little ditty". 2.6 GENERIC VERBS All of the basic verb classifiers represent concepts that are useful in their own right; i.e., they can be used to create useful verbs WITHOUT root morphemes to indicate states or actions. For this purpose, we can attach the terminator directly to the classifier. In effect, the classifier becomes a generic state root with the same class as the classifier. Here are examples of the most useful derivations: A/P-s: zoyasi - 'to keep/maintain' e.g. The warlord KEPT the village in a state of terror. [Note that the compound preposition "in a state of" marks what is called an oblique 'state' case role. This role is the static counterpart of the dynamic resultative case role that I mentioned earlier. I'll have much more to say about state case roles later.] A/P-d: pusi - 'to change' (transitive), 'to have an effect on', 'to affect' e.g. The divorce HAD AN EFFECT ON him. The wizard CHANGED the prince into a frog. 
[Here, the preposition 'into' marks the oblique dynamic 'state' case role; i.e. the _resultative_ case role.] A/P-p: cesi - 'to try to affect/change' e.g. He TRIED TO CHANGE her attitude, but she remained stubborn. AP/F-s: fisi - 'to keep oneself in an unspecified steady state with respect to', 'to stick with', e.g. He STUCK WITH the project. AP-s: panjisi - 'to persevere', 'to remain steadfast' e.g. He REMAINED STEADFAST/PERSEVERED until the end. AP-d: zasi - 'to change one's ways', 'to transform oneself', 'to cause oneself to change' e.g. He CHANGED HIS WAYS after the trial. AP-p: diusi - 'to try to change (oneself)' e.g. He TRIED TO CHANGE but remained a nerd. P/F-s: masi - 'to be in a relationship with', 'to be involved with', 'to have something to do with', 'to be in an unspecified steady state with respect to' e.g. Bill HAS SOMETHING TO DO WITH the new project. P/F-d: dosi - 'to enter a relationship with', 'to become involved with', 'to become associated with', 'to come to have something to do with' e.g. He BECAME INVOLVED WITH the new project. P-s: sesi - 'to be in an unspecified steady state', 'to experience something', 'something is happening/going on with P' e.g. SOMETHING'S HAPPENING/GOING ON WITH Bill. P-d: piasi - 'to change' (intransitive), 'to undergo a change' e.g. The office CHANGED since I was last here. Now, we can also derive generic ACTION verbs using a generic agent-oriented root morpheme. Our sample language will use the morpheme "-ze-" for this purpose; i.e., to represent a generic action. Keep in mind that action verbs emphasize what the agent is doing rather than what the patient is experiencing. Also, the focus of action verbs ALWAYS elaborates the event itself. Here are some of the more useful derivations of generic action verbs (in the sample language, "-ze-" will be A/P-s by default): A/P-s: zesi = zezoyasi - 'to run/operate/use/control' e.g. He CONTROLLED the company. John USED the hammer to break the window. 
[Note that the English word "use" often has a sense of 'abuse', especially if the object is sentient. The generic "zesi" does not have this sense, unless the context makes it inevitable.] A/P-d: zepusi - 'to do something to' e.g. Billy DID SOMETHING TO the cat. A/P-p: zecesi - 'to try (something)', 'to have a go at', 'to have at' e.g. He TRIED the stuck door. AP/F-s: zefisi - 'to do (something)', 'to keep oneself busy/occupied with (something)' e.g. He IS DOING his homework. AP-s: zepanjisi - 'to be doing something', 'to be busy/occupied' e.g. He IS BUSY right now. As we will see later, many of the above generic verbs can undergo additional derivation to produce some very useful words. 2.7 GRAMMATICAL VOICE So far, we've only talked about verbs in the active voice; i.e., where all of the arguments of a verb are present and appear in the proper order. For example, the A/P-d verb "to break" has an agent subject and a patient direct object. However, natural languages have many ways of changing the relative positions of the arguments of a verb in order to change their relative importance or topicality. Languages can also remove arguments from the argument structure, while implying that they still exist, and make the missing arguments either obliquely expressable or not expressable at all. Finally, languages can also incorporate normally oblique arguments, making them part of the argument structure of the verb. For example, consider the following: John broke the window. = active voice The window was broken. = passive voice, implied agent The window was broken by John. = passive voice, oblique agent The window was broken with a hammer. = passive voice, oblique instrument, implied agent A hammer broke the window. = incorporated instrument, agent cannot be expressed at all (*by John), new structure is something like I/P-d, where I = instrument. Those windows broke easily. = middle voice, implied agent, agent cannot be expressed at all (*by John). The windows broke. 
= P-d verb. This is sometimes confused with middle voice. In the system described in this monograph, this verb is a basic verb and the example is in the active voice. No agent is expressed or implied. John is the breaker (of the window) or John did the breaking (of the windows). = anti-passive (this is an approximation - English does not have a true morphological anti-passive construction). The agent alone is prominent. The patient loses its prominence but may be expressed obliquely. However, even when not expressed obliquely, a patient is always implied. The window broke John. (poetic license needed here) or The window, John broke it. = inverse voice (again, these are approximations - English does not have a regular inverse construction). Patient becomes subject, agent becomes object and MUST appear. Different languages handle these distinctions in different ways. As you can see from the above examples, English uses combinations of syntax, morphology, periphrasis, and even poetic license. Other languages are more regular, some using inflections for some voices, while others may use derivations or a combination of both. In addition, some languages allow the incorporation of other case roles into the argument structure of a verb. In fact, the number of possible voice variations among the world's languages is quite large. Since grammatical voice has different meanings to different people (with middle voice being the most confused/confusing), let me precisely define the meaning that I am using here. Specifically, A grammatical voice change starts with a basic verb and increases the topicality of one core argument relative to another. In the process, an existing argument may be deleted. A deleted argument may be expressed obliquely (e.g. passive) or may not be expressable at all (e.g. middle). However, the role of the deleted argument is ALWAYS implied. Thus, even though the original subject may not be expressed in a middle voice construction, it is still implied. 
For example, in "Mice kill easily", someone or something is responsible for the killing even though it cannot be expressed. In "Mice die easily", no agent is expressed or implied. Thus, the former is an example of a grammatical voice change, while the latter is not. An argument that increases in relative topicality is said to be _promoted_, and an argument that decreases in relative topicality is said to be _demoted_. Demoted arguments continue to play their original semantic roles, but are somehow less important or less involved. The following examples illustrate this effect: Active: The enemy bombed the city. Passive: The city was bombed. <- no agent or The city was bombed by the enemy. <- oblique agent Active: She sewed the dress. Anti-passive: She did the sewing. <- no patient or She did the sewing on the dress. <- oblique patient Although the number of possible voice combinations is large, there are a few that crop up often among the world's languages. Here are the most common ones: Active - transitive: The subject is slightly more important or topical than the object. Both must be expressed. This is by far the most common form used in almost all languages. [The only exceptions I know of are Fijian and the Salish languages of northwestern North America. In these languages, all transitive verbs are derived by addition of an affix to the intransitive form. Also, in Fijian, the most commonly used verb form is active INTRANSITIVE. (Shades of Sapir-Whorf!)] Passive: The original object becomes the subject and becomes considerably more topical than the original subject. The original subject is no longer part of the verb's argument structure, and does not have to be expressed. However, it is always implied and may be expressed obliquely (in English, typically using the preposition "by"). Middle: The original object is made more topical and becomes the subject. The original subject is deleted from the verb's argument structure. 
The original subject is implied, but is so unimportant that it can NOT be expressed obliquely. Only the original object and the event indicated by the verb remain important. Anti-passive: The subject is made considerably more salient than the object. The original object is no longer part of the verb's argument structure, and does not have to be expressed. However, it is always implied and may be expressed obliquely. Inverse: The arguments of the active verb are simply reversed. The original object becomes the subject, gaining in importance; and the original subject becomes the object, losing importance. Unlike passive, the original subject is not oblique and MUST appear. Keep in mind that the above are generalizations. Individual languages vary both in the ways that the various voices are implemented as well as in their semantics. Also, keep in mind that the list contains just the most common voice systems. Many other combinations are possible, especially those involving normally oblique case roles. As we saw above, a language like English, which does not have this ability, must resort to complex and idiosyncratic constructions to achieve the same effect. Always keep in mind, though, that a voice change simply re-arranges the topicality of some of the participants in a sentence. Our goal should be to achieve the same results in a consistent and easy-to-understand manner. Also, English rarely uses the same strategies to handle these needs. For example, an effect similar to (but not exactly the same as) that of passives and anti-passives can be achieved by using impersonal constructions: "Johnson punched someone" (anti-passive) or "They don't make good cars anymore" (passive). An effect similar to (but not exactly the same as) the inverse can often be accomplished by fronting or left dislocation, as in "(As for) the car, John wrecked it". However, true inverse effects can sometimes be obtained by periphrasis, as in: Active: A 10-year old can read the book. 
Inverse: The book is readable by a 10-year old. Active: The cup is full of water. Inverse: Water fills the cup. Finally, inverse and middle effects are sometimes achieved in English by using completely different root morphemes, as in "I enjoyed the show" vs. "The show pleased me" (inverse), or by use of metaphor or idiom, as in "He remembered the answer" vs. "The answer came to mind" (middle). [Incidentally, the inverse voice comes in two varieties. In the first, which is sometimes called a _semantic inverse_, an inverse operation may be required in order to properly assign case roles to the arguments of a verb. Semantic inverse constructions are especially common in the native languages of North America. For example, in Plains Cree (Algonquian), a more animate argument is inherently more topical than a less animate argument, and neither word order nor case marking of nouns can change the interpretation. Thus, if "man" and "dog" appear as the main arguments of the verb "bite", then it will always be interpreted as "man bites dog", regardless of word order. An inverse marking on the verb simply reverses the relative topicality, making "dog" more topical than "man", and is REQUIRED to obtain the sense "dog bites man". I do not consider this usage a true voice alteration. It is simply an uncommon way of marking semantic case roles in a sentence. Similarly, some Sino-Tibetan languages have an inverse voice based on the relative topicality of 1st, 2nd, and 3rd person, rather than animacy. Note though, that although an inverse operation may at times be required, it can also be used when it is not required in order to achieve the changes in topicality that we are describing here. In these cases, such an operation is called a _pragmatic inverse_. True pragmatic inverses can be found in languages such as Maasai (Nilo- Saharan), Sahaptian languages (Penutian, western North America; e.g. Nez Perce), Caucasian languages (e.g. Georgian), and Chamorro (Austronesian, Guam). 
(In fact, Maasai and Sahaptian languages have both semantic AND pragmatic inverses.) Finally, a combination of word order changes and direct case marking of nouns can sometimes be used to achieve an inverse effect (e.g. Korean). However, other languages which have this ability (e.g. Russian) frequently use it for quite different purposes. As for true inverse systems, recent research indicates that such systems are actually much more common among the world's languages than had been previously supposed.] 2.7.1 IMPLEMENTATION OF A GRAMMATICAL VOICE SYSTEM Most European languages (including English) use cumbersome rules involving auxiliaries, participles, reflexives, context, word-order, and even complete lexical changes to indicate voice. More heavily inflected languages (Arabic, Latin, Japanese, Ainu, etc.) use the very simple expedient of inflecting the verb for most indications of voice. Many South American lowland languages and some isolating (i.e. uninflected) languages such as Chinese and Vietnamese do not have a formal morphology or syntax to cover voice, although they can achieve similar effects via explicit topicalization and/or periphrasis. Finally, other languages such as the Bantu languages of Africa (e.g. Swahili) and Austronesian languages (e.g. Indonesian) use derivational morphemes (which is essentially what we are doing here) to achieve most voice effects. In other words, they create a completely different verb from the same root as the active verb, but the new verb has a different topicalization and argument structure. So, how should an artificial language implement grammatical voice? Ideally, we would like to create a system that can handle any voicing needs, while being both simple and consistent. I will say no more about syntactic approaches, since this monograph is strictly about morphology and semantics. (Besides, I do not feel that grammatical voice change should be implemented in syntax - syntax is not nearly as flexible as morphology.) 
Instead, I recommend that all grammatical voice changes be implemented using derivational morphology. One possible way to accomplish this would be to allocate a single _CLASS-CHANGING MORPHEME_ for each voice, which would be used in addition to the active classifier. The resulting verbs will, of course, have a different argument structure. In the sample language, here is the new morphology that I will use:

     Word ::= Root + ( Classifier ) + ( Class-Changing Morpheme ) + Part-of-speech
     Root ::= Morpheme
     Classifier ::= Morpheme
     Class-Changing Morpheme ::= Morpheme
     Morpheme ::= C D | C V { X }
     Part-of-speech ::= Terminator
     Terminator ::= da | no | pe | si

     C = any consonant (p, b, t, d, k, g, c, j, l, m, n, f, v, s, z, x)
     D = any diphthong (ai, au, eu, ia, ie, io, iu, oi, ua, ue, ui, uo)
     V = any vowel (a, e, i, o, u)
     S = any semi-vowel (w, y)
     X = extension = S V | C C V

     |  = logical 'or'
     () = enclosed item is optional
     {} = enclosed item may appear zero or more times
     Lower case letters represent themselves

For example, if the state root meaning 'closed/shut/unopened' is "-benzo-" (default class = A/P-d), then the word for the A/P-d verb 'to close/shut' is "benzopusi = benzosi". We can implement the other voices as follows:

     middle:        -de-  ->  benzodesi
        e.g. Windows benzodesi easily
           = Windows close easily.

     passive:       -nu-  ->  benzonusi
        e.g. The window benzonusi (by the thief)
           = The window was closed (by the thief).

     anti-passive:  -ga-  ->  benzogasi
        e.g. The thief benzogasi (of the window)
           = The thief did the closing (of the window)
        or = The thief was the closer (of the window).

     inverse:       -vi-  ->  benzovisi
        e.g. The window benzovisi the thief
           = The window - the thief closed it.

where optional oblique arguments are shown in parentheses.

In the above examples, the inverse paraphrase is only approximate, and actually increases the topicality of the fronted item more than it should. A better example of a true inverse effect in English would be:

     Active:  John owns the book.
     Inverse: The book belongs to John.

where "to belong to" should be thought of as a single complex verb (rather than as a verb plus the case marker "to"). Note that the second sentence is a true inverse of the first, and is only roughly approximated by the paraphrase "The book - John owns it".

A useful notational scheme would be to put an implied case role in square brackets, with a plus "+" or minus "-" sign to indicate whether it can be expressed obliquely. Thus,

     middle:        -de-  changes  A/P-x   to  P-x [-A]
                                   AP/F-x  to  F-x [-AP]
                                   P/F-x   to  F-x [-P]

     passive:       -nu-  changes  A/P-x   to  P-x [+A]
                                   AP/F-x  to  F-x [+AP]
                                   P/F-x   to  F-x [+P]

     anti-passive:  -ga-  changes  A/P-x   to  A-x [+P]
                                   AP/F-x  to  AP-x [+F]
                                   P/F-x   to  P-x [+F]

     inverse:       -vi-  changes  A/P-x   to  P/A-x
                                   AP/F-x  to  F/AP-x
                                   P/F-x   to  F/P-x

where "-x" represents either "-s", "-d", or "-p".

For verbs that take three arguments, I feel it is most useful to do the following:

     middle:        -de-  changes  A/P/F-x  to  P/F-x [-A]
        e.g. *The students taught French easily.
        [This is ungrammatical in English with the intended meaning, but grammatical in the sample language.]

     passive:       -nu-  changes  A/P/F-x  to  P/F-x [+A]
        e.g. The students were taught French (by Mr. Johnson).

     anti-passive:  -ga-  changes  A/P/F-x  to  A/F-x [+P]
        e.g. He shouted obscenities (at the crowd).
        [Note that the English verb "to shout" is inherently anti-passive. Thus, in the system proposed here, we would start by creating an A/P/F-p version of this verb, and then perform an anti-passive operation to derive an exact equivalent of the English verb "to shout".]

     inverse:       -vi-  changes  A/P/F-x  to  P/A/F-x
        e.g. The student - John taught him geometry.

In addition, some languages, such as Latin, Shona (Bantu), Turkish, Classical Greek, and German allow _impersonal_ passives, in which an intransitive verb is passivized becoming a zero-argument verb.
For example, the AP-s verb "to run" could be passivized to 0-s [+AP] or 0-s [-AP], depending on the language, where "0" is used to indicate that the verb has no arguments. It would be interpreted as something like 'running took place' or 'there was running'. A verb like P-d "to grow" could become 0-d [+/-P], and would mean something like 'growing took place' or 'there was growth'. You could allow your language to use the same voice changing morphemes to achieve similar results. Don't let your native language be a straightjacket! Try to accomplish as much as possible with as small a vocabulary as possible, and with as simple a syntax as possible. Another useful derivation would be to take an A/P/F verb and reduce the topicality of the THIRD argument. (Remember, the anti-passive discussed above reduces the topicality of the SECOND argument.) In the sample language, this derivation will be as follows: anti-anti-passive: -miu- changes A/P/F-x to A/P-x [+F] We'll see examples of how to use this later. Grammatical voice alterations are also useful for creating speech act verbs which never take a patient as a direct object, such as the A/F-s [+P] verb "to dictate", as in "He dictated the letter (to his aide)". For verbs like these, we can create a verb that DOES allow a direct object patient, and promote a focus to direct object by means of the anti-passive alteration. We can also derive exact equivalents of intransitive English verbs such as "to complain" (A-p [+P] [+F]) via a DOUBLE anti-passive voice alteration of the basic A/P/F-p form. For example, using the basic form, a sentence like "He complained me the new equipment" would be grammatical. A SINGLE anti-passive alteration would allow a sentence like "He complained the new equipment TO me", and a DOUBLE anti-passive alteration would create an exact equivalent of the English verb, as in "He complained TO me ABOUT the new equipment", where "TO" is an oblique patient case tag and "ABOUT" is an oblique focus case tag. 
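The voice alterations described so far can be read as simple operations on an argument structure. The following Python sketch is only an illustration of the notation, not part of the sample language: the class `ArgStructure` and the function names are hypothetical, and the sketch assumes that a demoted role is simply appended to the list of bracketed implied roles in the order in which it was demoted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArgStructure:
    """An argument structure such as A/P/F-p, plus implied roles like [+P].

    Hypothetical helper for illustration; the field names are assumptions.
    """
    core: tuple          # core case roles in order, e.g. ("A", "P", "F")
    kind: str            # classifier suffix: "s", "d", or "p"
    implied: tuple = ()  # (role, oblique_allowed) pairs, in order of demotion

    def __str__(self):
        head = "/".join(self.core) if self.core else "0"
        tail = "".join(
            f" [{'+' if oblique else '-'}{role}]" for role, oblique in self.implied
        )
        return f"{head}-{self.kind}{tail}"


def parse(text):
    """Parse a plain structure such as 'AP/F-d' (no bracketed roles)."""
    roles, kind = text.split("-")
    return ArgStructure(tuple(roles.split("/")), kind)


def demote(s, position, oblique):
    """Remove the core argument at `position` and record it as implied."""
    if position >= len(s.core):
        raise ValueError(f"{s} has no argument number {position + 1}")
    core = s.core[:position] + s.core[position + 1:]
    return ArgStructure(core, s.kind, s.implied + ((s.core[position], oblique),))


def middle(s):
    return demote(s, 0, oblique=False)   # first argument, not expressible


def passive(s):
    return demote(s, 0, oblique=True)    # first argument, oblique allowed


def anti_passive(s):
    return demote(s, 1, oblique=True)    # second argument, oblique allowed


def anti_anti_passive(s):
    return demote(s, 2, oblique=True)    # third argument, oblique allowed


def inverse(s):
    """Swap the first two core arguments; nothing is demoted."""
    if len(s.core) < 2:
        raise ValueError(f"{s} needs at least two arguments for an inverse")
    return ArgStructure((s.core[1], s.core[0]) + s.core[2:], s.kind, s.implied)


complain = parse("A/P/F-p")
print(anti_passive(complain))                # A/F-p [+P]
print(anti_passive(anti_passive(complain)))  # A-p [+P] [+F]
print(passive(parse("AP-s")))                # 0-s [+AP]  (impersonal passive)
```

Applying `anti_passive` twice to the basic A/P/F-p form yields A-p [+P] [+F], which is the structure given above for "to complain". An empty list of core arguments is printed as "0", following the notation used for impersonal passives.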
In fact, this double anti-passive operation is so useful that I have created a single voice-changing morpheme to implement it in the sample language, rather than using the anti-passive morpheme twice: double anti-passive: -kua- changes A/P/F-x to A [+P] [+F] At the same time, we can introduce a morpheme to perform a double PASSIVE: double passive: -jau- changes A/P/F-x to F [+A] [+P] For example, the single passive would be used to create the equivalent of the English word "bespoken", while the double passive would be used for the word "spoken". The double passive will be especially useful with speech act verbs. Finally, I know of a few languages such as Finnish, Ika (Lower Niger), Breton, and Alabama-Koasati (Muskogean) which implement impersonal constructions in morphology. In an impersonal construction, an argument is REDUCED in topicality while the topicality of the other argument is left unchanged. Most languages, however, implement impersonal constructions using impersonal pronouns, as in the following English examples: THEY don't make cars like THEY used to. ONE should never assume anything. SOMEONE broke the window. THEY gave him SOMETHING for the pain. YOU/ONE should always drive carefully. If you decide to implement impersonal constructions in morphology, you will need class-changing morphemes to perform the impersonal SUBJECT alteration: X/Y-x -> 0/Y-x [-X] and the impersonal OBJECT alteration: X/Y-x -> X/0-x [-Y] Note the notational distinction between the impersonal subject alteration result "0/Y-x [-X]" and the middle voice alteration result "Y-x [-X]". Later, I will discuss how to implement impersonal constructions that use impersonal pronouns. 2.7.2 MORE ON MIDDLE VOICE The passive, anti-passive, and inverse voices are easy to understand, and I'll say no more about them. Middle voice, however, is so frequently confused with basic intransitivity that I'd like to say a little more about it. 
English does not have a formal morphosyntax for middle constructions, unlike many other languages (Persian, Swahili, Basque, Somali, Hausa, Turkish, and many, many others - middle forms in these languages often go by other names, such as statives or agentless passives, but they often function semantically as middles). English does not even have a reflexive clitic construction, as do several other European languages, which often performs additional duty for middle voice. This is unfortunate, since, as we will see, it can be extremely useful and productive.

English SOMETIMES allows an active verb to be used without modification in a middle construction, as long as the context forbids an active interpretation. Thus, we can say "The joke did not translate well" or "The plane landed ten minutes ago". But even when the meaning is clear, English can be quite idiosyncratic as in "*The mountains see in the distance" or "*The boxes are covering in the storeroom". And in cases where context and semantics do not make it clear, English is forced to use periphrastic constructions, completely different words, metaphor, and even idiom. Consider the following examples:

     ACTIVE:  I see the mountains.
     MIDDLE:  *The mountains see.
              The mountains are in view.
     Thus, from the verb "to see", P/F-s, we can derive:
              "to be in view", F-s [-P]

     ACTIVE:  The gang terrorized the neighborhood for three years.
     MIDDLE:  *The neighborhood terrorized for three years.
              The neighborhood lived in a state of terror for three years.
     Thus, from the verb "to terrorize", A/P-s, we can derive:
              "to live in a state of terror", P-s [-A]

     ACTIVE:  That woman buys caviar only when it's on sale.
     MIDDLE:  *Caviar buys only when it's on sale.
              Caviar sells only when it's on sale.
     Thus, from the verb "to buy", AP/F-d, we can derive:
              "to sell (intransitive sense only)", F-d [-AP]

     ACTIVE:  He threw the rock at the window.
     MIDDLE:  *The rock threw at the window.
              The rock went flying at the window.
     Thus, from the verb "to throw", A/P-d, we can derive:
              "to go flying (metaphorically)", P-d [-A]

     ACTIVE:  I remembered her face.
     MIDDLE:  *Her face remembered.
              Her face came to mind.
     Thus, from the verb "to remember", P/F-d, we can derive:
              "to come to mind", F-d [-P]

     ACTIVE:  He swallowed the pills with difficulty.
     MIDDLE:  *The pills swallowed with difficulty.
              The pills went down with difficulty.
     Thus, from the verb "to swallow", A/P-d, we can derive:
              "to go down", P-d [-A]

And so forth. The number of possible examples is almost unlimited. Thus, English CAN deal with middle concepts, although the forms are usually highly irregular, unpredictable, periphrastic, and often either metaphoric or idiomatic.

Finally, middle verbs are often confused with basic P-x state verbs (English-speaking linguists often make this mistake). The reason is that the patient is the subject of an intransitive verb, and it is often uncertain whether or not the transitive subject is implied. In languages which have a formal middle voice, however, there is never any doubt. Unfortunately, speakers of languages like English will have to be a little more careful. When in doubt, the basic P-x form should always be used instead of the middle form unless a transitive subject is clearly implied.

Middle verbs are also often confused with reciprocals and reflexives because some languages (especially European languages) use the same forms for more than one voice. In the semantic system being proposed here, middles, reflexives, and reciprocals are completely different. [We will discuss reflexives and reciprocals in great detail later, in the section on class-changing morphemes.]

2.7.3 INCORPORATING OBLIQUE CASE ROLES

Some natural languages can make almost any case role a subject or object of the verb (e.g. Malagasy, some Mayan languages, and most Philippine languages).
In fact, among the Philippine languages, verbs almost always have an explicit morpheme that indicates the case role of the subject, and almost any case role can be promoted to subject. Many Bantu languages of Africa (e.g. Swahili) and some Australian languages (e.g. Dyirbal) allow an instrumental case role to be promoted to object. Many Bantu languages also allow a locative case role to be promoted to subject. Indonesian allows a beneficiary case role to be promoted to object. And so on.

Obviously, the above system could be easily extended to add normally oblique case roles to the argument structure of a verb. However, I will NOT be doing this in the sample language for the following reasons:

     1. It is extremely rare among natural languages.

     2. The number of possible combinations of argument position and case role is very large, and would require a large number of class-changing morphemes that would rarely be used.

     3. Most languages that allow promotion of normally oblique case roles have special reasons for doing so. For example, many languages allow relativization of only certain core arguments, and thus a voice change is REQUIRED before other arguments can be relativized.

For all of the above reasons, I feel that it is not necessary to implement grammatical voice changes that would promote normally oblique case roles to core positions. In fact, as we'll see later, ALL oblique case markers will be derived from verbs, and these verbs can be used directly instead of case tags (i.e. prepositions) when we need to increase the topicality of the associated role. Thus, while there must be a way to modify the relative topicalities of core arguments, there is simply no need to create special morphemes to promote normally oblique arguments.

2.7.4 NOMENCLATURE OF GRAMMATICAL VOICE

There are two voice changing operations that demote an argument: passive and middle. A passive voice change demotes an argument but allows it to be expressed obliquely.
A middle voice change demotes an argument but does NOT allow it to be expressed obliquely. If the prefix "anti-" is NOT used, the first argument (i.e., the subject) is demoted. If a single "anti-" prefix is used, the second argument (i.e., the first object) is demoted. If two "anti-" prefixes are used, the third argument (i.e., the second object) is demoted. The modifier "double" indicates that the same voice operation is performed twice.

For example, an anti-middle demotes the second argument, and suppresses its salience so much that it cannot be expressed obliquely. An anti-anti-passive demotes the third argument and allows it to be expressed obliquely. And so on.

2.7.5 ENVIRONMENTAL VERBS

One of the major uses of a grammatical voice change is to take an existing argument and make it implied but inexpressible. Since environmental verbs often have an implied but non-statable patient, they are ideal candidates for voice derivation from a more basic form. Consider the following (assume that the root morpheme for 'hot' is "-xau-", and that the default class is P-s):

     They heated the room.
        transitive = A/P-d = xaupusi
        'to heat', 'to raise the temperature of'

     The room heated up.
        intransitive = P-d = xaupiasi
        'to heat up', 'to rise in temperature'

     It heated up (outside), It got hotter out, It warmed up.
        'it heated up', middle voice = 0-d [-P] = xaupiadesi

     The room is hot.
        'to be hot', intransitive = P-s = xausi = xausesi

     It's hot (out).
        'it is hot (in the implied environment)',
        middle voice = 0-s [-P] = xaudesi = xausedesi

Note that it is also possible to create passive forms, such as:

     The room was heated up (by the janitor) => xaupunusi = P-d [+A]
     It got hotter (in the room) => xaupianusi = 0-d [+P]

And so forth.

Other environmental verbs can be formed in the same way, including verbs that explicitly indicate the presence of some kind of physical entity, such as "to rain" or "to snow". [We'll have more to say about how to derive such verbs later.]
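The word-formation rules of section 2.7.1 and the environmental verb forms above can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: the names `is_morpheme` and `build_word` are hypothetical, the regular expression is one reading of the rule "Morpheme ::= C D | C V { X }", and the function accepts several class-changing morphemes in a row, which goes slightly beyond the Word rule as written.

```python
import re

# Character classes taken from the morphology of section 2.7.1.
C = "[pbtdkgcjlmnfvszx]"                         # any consonant
D = "(?:ai|au|eu|ia|ie|io|iu|oi|ua|ue|ui|uo)"    # any diphthong
V = "[aeiou]"                                    # any vowel
S = "[wy]"                                       # any semi-vowel
X = f"(?:{S}{V}|{C}{C}{V})"                      # extension = S V | C C V

# Morpheme ::= C D | C V { X }
MORPHEME = re.compile(f"(?:{C}{D}|{C}{V}{X}*)")
TERMINATORS = {"da", "no", "pe", "si"}


def is_morpheme(text):
    """Return True if `text` is a well-formed morpheme."""
    return MORPHEME.fullmatch(text) is not None


def build_word(root, classifier="", changers=(), terminator="si"):
    """Assemble Root + (Classifier) + class-changing morphemes + terminator.

    Assumption: an omitted classifier means the root's default class is used.
    """
    parts = [root] + ([classifier] if classifier else []) + list(changers)
    for part in parts:
        if not is_morpheme(part):
            raise ValueError(f"not a well-formed morpheme: {part!r}")
    if terminator not in TERMINATORS:
        raise ValueError(f"unknown terminator: {terminator!r}")
    return "".join(parts) + terminator


# Forms of the root "-xau-" ('hot') from the examples above.
print(build_word("xau", "pu"))            # xaupusi     A/P-d
print(build_word("xau", "pia", ["de"]))   # xaupiadesi  0-d [-P]
print(build_word("xau", changers=["de"])) # xaudesi     0-s [-P]
print(build_word("xau", "pu", ["nu"]))    # xaupunusi   P-d [+A]
```

The sketch only concatenates morphemes and checks their shape. It does not check that a given voice morpheme makes sense for the argument structure of the verb it is attached to.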
2.7.6 DISJUNCTS When using verbs, we must be careful not to confuse case roles. It is sometimes easy to mistake an event for a patient. Consider the following example: It's sad that John died. It is tempting to treat the embedded sentence "John died" as if it were the patient in a P-s state verb formed from the root meaning 'sad'. However, an event cannot be "sad" in the sense that it can experience sadness. What we are really describing are the feelings of the speaker (and perhaps others) towards the situation. Thus, when we say "it's sad that ...", we are really describing our feelings or beliefs about the situation. In effect, the speaker and those he may be speaking to are the real, implied patients. Thus, in a sentence like the above, the true patient is implied, and the mental state of the patient is 'focused' on the event indicated by the embedded sentence. Thus, the embedded sentence is the FOCUS of the main state verb. We can easily create a basic P/F-s verb meaning 'to be sad about', as in the sentence "Bill is sad about his parents' divorce". Using this basic verb, we can perform a middle voice alteration to create the F-s [-P] form meaning 'it is sad that'. It is also possible for an event to be the agent or cause of the sadness. For this, we would need an A/P-s verb, since the event itself causes the patient to be sad. Thus, we really have several possible forms, as illustrated below: A/P-s John's death makes (i.e. keeps) me sad. A/P-d John's death saddened me. P/F-s I am sad that John died. F-s [-P] Sadly, John died. F-s [+P] It's sad for everyone that John died. A similar analysis can be done using the state concept 'hoping': P/F-s I hope that I'll win. F-s [-P] Hopefully I'll win. where "hopefully" is actually a verb that takes a complete embedded sentence as an argument - it is NOT an adverb as in English. 
Words and expressions like these are called _disjuncts_, and many other examples can be derived in the same way: "to presume" -> "presumably", "to be interesting" -> "interestingly", "to be possible" -> "possibly", "to be incidental" -> "incidentally, by the way", "to be necessary" -> "necessarily", "to be fortunate" -> "fortunately", etc. Also, many of these concepts reflect the ATTITUDE of the patient about an event. Thus, they are essentially _modal_ in nature. I'll have a lot more to say about _modality_ later. 2.7.7 GENERIC VOICE DERIVATIONS Voice derivations can also be performed on generic verbs. Again, we will use the generic action root "-ze-" for generic action verbs, and will assume a generic state root when a root is not present. The results are very useful: Anti-passive A-d [+P] "pugasi" - 'to work/cause a change' e.g. The new job CAUSED A CHANGE (in his behavior). Middle P-s [-A] "zedesi = zezoyadesi" - 'to be under control' e.g. The reactor IS now UNDER CONTROL. Inverse P/A-s "zevisi = zezoyavisi" - 'to be under the control of' e.g. The project IS now UNDER THE CONTROL OF the engineering department. Anti-passive A-s [+P] "zegasi = zezoyagasi" - 'to be in control/charge' e.g. John IS IN CONTROL (of the project). Middle P/F-s [-A] "zetuedesi" - 'to experience' e.g. John EXPERIENCED a new kind of freedom. Inverse F/P-d "zedovisi" - 'to befall' e.g. Tragedy BEFELL the entire crew. Anti-middle A/F-p [-P] "zenioxisi" - 'to try/attempt' e.g. John TRIED to open the door. In the last example, the patient cannot be obliquely expressed because it appears within the focus; i.e. because it is a direct patient of the embedded sentence that elaborates the 'trying' event. [Remember, the focus of an action verb must always elaborate the event indicated by the verb.] Note that we also introduced the class-changing morpheme "-xi-" to perform the anti-middle voice operation. 
As we discussed earlier, the prefix "anti-" is used when the second argument of the verb is demoted. 2.8 INDIRECT CAUSATION AND VERBAL COMPLEMENTS In many of our verb derivations, we used the word "cause" in our paraphrases of the semantics of verbs which have an agent in their argument structure. Unfortunately, these paraphrases are approximate and often imply some distance between the agent and the event. However, I must emphasize that the agent argument of a verb is the entity that is DIRECTLY responsible for the event indicated by the verb. Thus, there is a definite semantic difference between 'kill' and 'cause to die', even though our paraphrases may imply otherwise. If we wish to intentionally put distance between an agent and an event, we must design words that are equivalent to English "cause", "make", etc. Consider the following sentences: He MADE his son wash the dishes. I HAD Bill deliver the package. He CAUSED his wife to have a miscarriage. In the above examples, the patient (if that is what it really is) cannot be expressed directly: *He made his son. *I had Bill. *He caused his wife. However, the English verb "to cause" can be used without this quasi patient: John caused the accident. Thus, these verbs indicate that an INDIRECT agent is responsible for an event which itself may have a DIRECT agent - the quasi patient is not at all a true patient of the verb "cause/make/have" (although it may be the true patient of the embedded sentence). Also, the English distinction between "cause", "have", and "make" is somewhat idiosyncratic. Semantically, there is no significant difference between them. [Actually, "to have" is a more polite version of "to make", but this distinction is not important to us here. We will discuss how to derive more polite forms of words in the section on register variations. More importantly, though, we will derive a verb with this sense of 'to have' in the section on _modality_.] 
The most neutral paraphrase of indirect causation is simply 'to cause to exist' or 'to cause to be real'. In our sample language, I will use the state root "-veya-" to represent the concept 'real/actual/existent' (default class = A/P-d). Here are some useful derivations: A/P-d: "veyasi=veyapusi" - 'to cause/make/create', 'to cause to come into existence', 'to cause to become real/actual' e.g. John CAUSED the accident. John MADE Billy wash the dishes. John MADE some apple cider. A/P-s: "veyazoyasi" - 'to insure/guarantee', 'to keep/maintain a reality' e.g. The contract GUARANTEED their compliance. P-s: "veyasesi" - 'to be real/actual', 'to exist', 'there be' e.g. The particles EXIST for only ten nanoseconds. THERE ARE ten people at the party. THE SITUATION IS SUCH THAT only ten people came. P-d: "veyapiasi" - 'to come into existence', 'there came to be', 'it came to be that' e.g. The new policy CAME INTO BEING after he resigned. THERE CAME TO BE fewer people willing to help. IT CAME TO BE THAT fewer people were willing to help. Note that English uses different forms depending on whether the focus is a noun phrase (i.e. an entity) or an embedded sentence (i.e. an event). Another type of indirect causation can be indicated by using a generic verb. As we mentioned earlier, the focus of an action verb MUST elaborate the event described by the verb. In effect, the agent is responsible for causing or bringing about the event. For example, in a sentence like "John told a joke", the word "joke" elaborates what John did; i.e. an elaboration of a 'telling' event describes the specific method or means used in the 'telling'. Thus, a GENERIC A/P/F-d action verb would indicate that the agent successfully affected the patient BY MEANS OF the focus. Now, if we used the unmodified A/P/F-d generic action form "zekosi", we could create a sentence like this: A P F John zekosi (the project) (he resigned). = John did something to the project by resigning. 
     = John affected the project by resigning.

This sentence sounds odd in English, but is perfectly acceptable in our sample language. To implement it differently, we could simply use a nominalized verb which indicates the process of "resigning":

      A                 P               F
     John zekosi (the project) (his resignation).
     = John did something to the project with his resignation.
     = John affected the project with his resignation.

We'll discuss how to create process nouns later.

The anti-middle A/F-d [-P] form is also useful:

     A/F-d [-P]  "zekoxisi" - 'to affect someone/something unspecified by means of'
        e.g. John CAUSED SOMETHING TO HAPPEN BY resigning.

If we now perform an additional normal middle voice operation, we get:

     F-d [-A] [-P]  "zekoxidesi" - 'something was affected by doing X'
        e.g. SOMETHING HAPPENED WHEN John resigned.

This last example could have also been accomplished by means of a double middle construction; i.e. "zekodedesi". In fact, this double construction will be useful enough to allocate a single CCM:

     double middle  -voi-  changes  A/P/F-x  to  F-x [-A] [-P]

Thus, the new word is "zekovoisi". Note how we neatly capture some important functions which are often highly idiosyncratic in natural languages.

Finally, basic verbs which already have a focus as a direct object can also take an embedded sentence as an argument, as the following examples illustrate:

     John wanted the book   vs.  John wanted Bill to leave
     I saw the mountains    vs.  I saw them working
     They liked her         vs.  They liked her portrayal of Juliet
     We know the answer     vs.  We know he wants her to buy the car

Note that the English embedded sentences are idiosyncratic in that they require either infinitives, participles, nominalizations, or complete finite sentences, depending on the particular verb. By using an embedded sentence with the same form as a normal sentence (i.e. a complete finite sentence), you can achieve the same effect with a simpler morphology and syntax.
Here is how the above examples would look (the complete embedded sentence is in parentheses):

     John wanted (Bill leave).
     I saw (they were working).
     They liked (she portrayed Juliet).
     We know (he wants (she buy the car)).

They seem awkward in English, but they're linguistically sound, syntactically simpler, and totally lacking in idiosyncrasy. Also, this approach is used in MANY natural languages.

[Incidentally, an embedded sentence which appears as the argument of a verb (subject or object) is called a _complement_.]

2.9 IS IT REALLY AN ACTION VERB?

Some verbs that appear to be actions, especially speech acts, imply an attempt to put the patient in a particular, known state. For example, the verb "to apologize" indicates that the agent is attempting to cause the patient to be 'forgiving'. Similarly, the verb "to lie" indicates that the agent is attempting to cause the patient to be 'deceived'. Since we know the intended final state, we might be tempted to use it in converting an action concept to what is effectively a state verb. For example, we could perform the following derivations:

     A/P-p: to scold          =>  A/P-d: to shame, to cause remorse
                                  P-s: to be ashamed, remorseful
     A-p [+P]: to lie         =>  A/P-d: to deceive
     A-p [+P]: to apologize   =>  A/P-d: to achieve forgiveness from

And so forth. In other words, the "-p" verbs imply a POTENTIAL change of state while the "-d" verb indicates an ACTUAL or SUCCESSFUL change of state.

However, even though the above derivations imply it, the root meanings of the "-p" and "-d" verbs are not really the same. For example, an effective scolding is not the only thing that can cause shame or remorse; an effective lie is not the only thing that can cause one to be deceived; and an effective apology is not the only way to achieve forgiveness. In other words, the action and the state are not truly reversible.

One problem here is that the English state verbs do not imply the use of speech, whereas the speech act verbs do.
This suggests that a better approach would be to derive what appears to be a speech act from a STATE concept and add a morpheme to indicate that the event involves speech. Here is a sample derivation using the concept of 'forgiving':

[Note that the basic state here is 'forgiving' - NOT 'forgiven'. A 'forgiving' state is a true mental state. However, the state of being 'forgiven' is not a true state since it merely indicates the attitude or feeling of someone ELSE towards the forgiven person. Note also that the concept of 'forgiving' implies a relationship with the person forgiven. Thus, the forgiven person is the focus, NOT the patient.]

     P-s: to be a forgiving person, to be forgiving/magnanimous

     AP/F-d: to forgive (i.e. to cause oneself to enter a forgiving state focused on someone else)

     A/P/F-p: to attempt to cause the patient to be forgiving towards the focus.

     AF/P-p: to attempt to achieve forgiveness for oneself from someone. I.e., the agent attempts to cause the patient to be forgiving to the agent.

     AF-p [+P]: to attempt to achieve forgiveness - 'forgiver' can be expressed obliquely

Now, if we add a morpheme to the AF-p [+P] derivation which indicates that the verb implies the use of speech, we get:

     speech morpheme + AF-p [+P]: to apologize

Note that we had to introduce the new and unusual case role "AF". This can be implemented using a class-changing morpheme similar to what we did earlier when we discussed grammatical voice operations. In this case, the new class-changing morpheme would simply perform a _reflexive_ function, indicating that the agent and focus are the same entity.

Finally, on first glance, it appears that the verb "to apologize" has a focus, as in "I apologized for my rude behavior". However, the preposition "for" does not introduce a true focus - it actually marks the REASON case role. [We'll have more to say about reflexives and the reason case role later.
Also, the actual implementation of the speech morpheme mentioned here will be discussed later in the section on modality.]

The above works well with the concept 'forgiving', and should also work well for other verb pairs such as 'lie/deceive'. In other words, many words which appear to be actions (especially speech acts) can be derived from state concept root morphemes, and are not true actions.

However, this will not work in all cases. For example, if we tried to derive the verb "to scold" from the state concept of 'ashamed/remorseful', we end up with:

     A/P-d: to shame/embarrass
     A/P-p: to attempt to embarrass
     speech morpheme + A/P-p: to scold

Unfortunately, the English word "scold" implies much more than simply an attempt to cause shame using speech. It also implies that the patient is either young or immature, and that the speech act is somewhat sharp and loud. This does not mean, however, that the above derivation is useless. In fact, it is a very good derivation of the English word "reprove".

This still leaves us with the problem of words like "scold". Since the concept behind 'scold' is complex, and since the word is a very useful one, we would probably want to create a unique action root for it. And, as with other action derivations, the "-s" and "-d" forms will indicate successful control or change of the patient:

     A/P-p: to scold
     A/P-d: to effectively and successfully scold
     A/P-s: to keep under control by scolding ("to henpeck"???)

In other words, when an action root is used with a "-s" or "-d" classifier, the resulting state can be defined as whatever complex and/or vague state that a patient experiences when the action is effective or successful. In many cases, we'll find that such derivations can be quite useful, often replacing periphrastic, metaphoric, or even idiomatic expressions in English.
Here are some examples: A/P-p: to speak to A/P-d: to have a good/productive talk with A/P-p: to hit A/P-d: to give something a good whack A/P-p: to scold A/P-d: to set someone straight, to give someone a good dressing-down A/P-p: to kick A/P-d: to give something a good, swift kick AP/F-s: to play (a game) AP/F-d: to win A/P/F-p: to tell someone something A/P/F-d: to inform someone of something A/P/F-p: to call someone something A/P/F-d: to name/christen someone something [Actually, "to call/name/christen" can also be implemented as a state verb rather than as an action verb. At this point in time, I can't make up my mind which approach is better.] And so forth. In summary, we should always try to derive potential actions from state concepts, if possible. If not, then we should create basic action roots and perform additional derivations as described above. Now, it is extremely important to keep in mind that a word with subtle shades of meaning like the English word "scold" is not likely to have exact counterparts in other natural languages, and deriving a close (but not quite exact) counterpart using a state root rather than a specific action root will, in the long run, be much more productive. Thus, in spite of all the above, my personal preference is to derive an equivalent of the English word "scold" using a state root meaning 'ashamed' or 'embarrassed', the A/P-p classifier, and the special speech morpheme. This will not capture the precise meaning of the English word, but it will be very close and just as effective and useful. Also, if we insist on capturing the subtleties of every English word, then in fairness, we must do the same for all other natural languages, which is quite impossible. Finally, the astute reader may have realized from the start that verbs like "to lie" and "to apologize" cannot possibly be action verbs for two reasons: First, since the final, desired state of the patient is precisely known, the concept is inherently stative in nature. 
Second, a concept like 'forgiving' implies a relationship with the 'forgiven' entity. Thus, since the focus of an action MUST elaborate the event itself, the verb cannot possibly be an action. However, some verbs, especially speech acts, can be deceptive, and I felt that it was important to spend a little time clarifying their semantics.

2.10 FOCUSED VERSUS UNFOCUSED

At first glance, it would seem that the only difference between most focused and unfocused verbs is simply syntactic, and that there is little or no difference in the actual MEANINGS. As it turns out, though, there is a significant semantic distinction between focused and unfocused verbs, as I hope to illustrate now. For starters, consider the following:

    AP/F-s:  I think about Joan a lot.
    AP-s:    I think therefore I am.

The focused verb refers to specific situations, while unfocused verbs seem to indicate both an ability and an actuality. Thus, the unfocused verb can be paraphrased as "I not only have the ability to 'think', but I also put this ability to use".

Another way of looking at the distinction is that unfocused verbs do not make any reference at all to a specific focus (either implied or stated). Thus, the AP-s verb "to think" could also be paraphrased as 'to be a thinker'.

Note that this is very similar to the anti-passive. The difference, though, is that the anti-passive simply reduces the topicality of the object while implying that it is still taking part in the event. A completely unfocused verb eliminates the object completely. Thus, the anti-passive version of the above example would mean something like "I am THE thinker (in the current context)", while the completely unfocused version means "I am A thinker" (regardless of the current context).

Also, do not confuse an unfocused verb with a focused one whose focus is strongly implied by the context but not mentioned. For example:

    Boss:    "I want to make sure that no one screws up."
    Worker:  "I understand."
The verb "understand" must be either P/F-s with an omitted object or the anti-passive P-s [+F] because the focus is still strongly implied, even though it is not specified.

Here are some other examples. In each case, try to extract the sense of the unfocused verb with paraphrases that use the focused verb. These paraphrases should sound something like "I not only have the ability to xxx, but I also put this ability to use", "I am an xxx-er", or "I am an xxx-ing kind of person":

    P/F-s:    to understand
    P-s:      to be discerning or astute
              (Note how this is similar to the distinction between
              "to know" and "to be intelligent".)

    AP/F-s:   to imitate
    AP-s:     to be a copycat/imitator

    P/F-s:    to see
    P-s:      to have sight/vision, to be a sighted person

    AP/F-d:   to befriend
    AP-s:     to be amiable/friendly

    P/F-s:    to enjoy
    P-s:      to enjoy oneself, to have a good time

    P/F-s:    to need
    P-s:      to have needs

    A/P/F-p:  to order/command
    A/P-p:    to give orders to

    A/P/F-d:  to order (successfully)
    A/P-s:    to be in charge of (someone)

And so on. Note that the English versions of the unfocused verbs are almost always idiosyncratic or periphrastic.

In summary, there is a definite semantic difference between focused and unfocused verbs. Also, by making this distinction, we achieve some highly useful results in a manner that is simple, consistent, and semantically precise.

3.0 NOUNS

By now, it should be obvious that word design can be extremely productive in a language possessing a rich classificational morphology. This kind of morphology allows the AL designer to create a large vocabulary with semantic precision, while minimizing the number of root morphemes needed. However, so far we've only used this approach to design basic VERBS. We now need to see if a similar approach can be used to design basic NOUNS.

I began my discussion of verbs by providing a large number of examples that I placed into groups based on their argument structures.
I felt that this was necessary because my approach to classifying verbs is unusual (and probably unique). For nouns, though, I don't think that large numbers of examples will be needed, simply because the classes and their semantics are fairly obvious. [Incidentally, I am not aware of any other work that classifies verbs as I have done here. Initially, I was tempted to adopt the more widely accepted Vendlerian analysis which classifies all verbs into the four major categories: _state_ (e.g. "to know", "to love"), _activity_ (e.g. "to run", "to sing"), _accomplishment_ (e.g. "to sing a song", "to write a book") and _achievement_ (e.g. "to die", "to find"). However, although I experimented with these four categories, I was very unhappy with the results. The standard categories seemed too vague, and I often had difficulty deciding which category a verb belonged to. In any case, I felt that I needed a more productive system, and eventually ended up with the approach that I am presenting here.] 3.1 BASIC NOUN CLASSES Before starting, let's precisely define what we mean by the expression "basic noun". Here is the definition that I will use: A basic noun will represent an entity that has an actual physical existence (including entities from fantasy, mythology, etc.). Thus, such an entity must be composed of matter, energy, a combination of both, or time. Furthermore, characteristics which distinguish it from other entities must be verifiably physical (as opposed to functional, social, cognitive, etc). Note that my definition is purely semantic and has nothing to do with how a word is actually used in a sentence. Thus, for example, we will derive the word for "window" as a basic noun, while the word for "teacher" will be derived from a basic verb (as we illustrated earlier), even though both are used as nouns in a sentence. The word "window" is a basic noun because it can be uniquely described using only its physical properties. 
The word "teacher", however, must be derived from a verb, even though it represents a physical entity, because it does not differ from other related entities (such as "student" or "learner") in a verifiably physical way. In other words, we cannot distinguish a "teacher" from a "student" or determine their respective natures by examining only their physical traits. Their differences lie in what they DO, not in what they ARE.

At any rate, if the above definition causes problems, we can use the following simpler (and perhaps more practical) alternative: a basic noun represents an entity that cannot be easily or logically derived from a basic verb. As it turns out, the derivational system that I am proposing here is completely reversible, making the actual definition irrelevant. In effect, a "basic verb" is any word whose class is indicated by a verb classifier, and a "basic noun" is any word whose class is indicated by a noun classifier.

I will classify most basic nouns as follows:

    1. An entity represented by a basic noun must consist of matter, energy, a combination of both, or time.

    2. An entity of matter and/or energy represented by a basic noun must be either living or non-living.

    3. A non-living entity represented by a basic noun must be either natural or artificial.

Note that this is not the only way that one can classify nouns. For example, some classificational schemes make distinctions based on composition, shape, consistency, size, etc. Other schemes, especially those that have evolved naturally in some human languages, make distinctions based on concepts such as animacy and function, which often require reference to culture, religion, history, etc. However, I feel that the system proposed here is not only easier to work with and inherently neutral, but is also much more productive in its ability to generate as many words as possible from as few roots as possible.
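The three classification rules above can be read as a small decision procedure. The following Python sketch is only an illustration of that reading: the `Entity` record, its field names, and the function name are hypothetical, and the rules are applied literally, without any further refinements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    """Physical facts about an entity (field names are illustrative)."""
    has_matter: bool = False
    has_energy: bool = False
    is_time: bool = False
    living: Optional[bool] = None      # only meaningful for matter/energy
    artificial: Optional[bool] = None  # only meaningful for non-living entities


def basic_noun_class(e: Entity) -> str:
    """Apply rules 1-3 in order and return a class label."""
    # Rule 1: matter, energy, a combination of both, or time.
    if e.is_time:
        return "time"
    if e.has_matter and e.has_energy:
        composition = "matter & energy"
    elif e.has_matter:
        composition = "matter"
    elif e.has_energy:
        composition = "energy"
    else:
        raise ValueError("a basic noun must consist of matter, energy, or time")

    # Rule 2: living or non-living.
    if e.living:
        return f"{composition}, living"

    # Rule 3: natural or artificial. Applied uniformly here; a real design
    # may decide that the distinction is not worth making for some classes.
    origin = "artificial" if e.artificial else "natural"
    return f"{composition}, non-living, {origin}"


window = Entity(has_matter=True, living=False, artificial=True)
tree = Entity(has_matter=True, has_energy=True, living=True)
winter = Entity(is_time=True)

print(basic_noun_class(window))
print(basic_noun_class(tree))
print(basic_noun_class(winter))
```

A word like "teacher" has no place in this procedure, since nothing in the record distinguishes a teacher from a student; that is exactly why such words are derived from verbs instead.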
So, using this approach, we can create the following basic noun classes:

    matter & energy:
      living                  -> man, lizard, demon, grass, tree, bacteria
      non-living, natural     -> storm, tornado, geyser, earthquake
      non-living, artificial  -> computer, airplane, oven, fountain

    matter:
      living                  -> hand, leaf, branch, liver, acorn, ear
      non-living, natural     -> water, salt, rock, cliff, river, island
      non-living, artificial  -> keyboard, statue, ax, book, wharf, kitchen

    energy:
      living                  -> lifeforce, ghost, god, energy creature
      non-living              -> heat, thunder, light, noise, energy, photon

    time:                     -> winter, sunset, equinox, morning, midnight

I am not making a distinction between natural and artificial, non-living energy because we would be forced to make useless distinctions. For example, "light" from the sun would require a different classifier than "light" from a light bulb.

Note an important relationship that can be seen from the above list: items of pure matter or pure energy can be COMPONENTS of items that consist of both matter and energy. For example:

    matter & energy    matter    energy
    ---------------    ------    ------
    man                hand      lifeforce
    steam              water     heat
    computer           wire      electricity

In other words, complete energized or living entities that can function independently are treated as both matter and energy, while their components are treated as just matter or just energy.

I believe that the above classes are fundamental, and that any useful system should contain at least these nine classes. However, I also feel that we should provide additional sub-classes for classes that are likely to have a large number of members. For example, in the 'matter & energy, living' class, it will be useful to distinguish between plants and animals. In fact, I feel that we should create even finer distinctions, such as between 'mammal', 'bird', 'fish', 'insect', etc. In the 'matter, non-living, artificial' class, it will be useful to distinguish between substances (e.g. "plastic"), locatives (e.g. "wharf") and others (e.g.
"hammer"). Finally, the substance/locative/ other distinction should also be applied to the 'matter, non-living, natural' class to allow us to distinguish between words such as "water" (substance), "cliff" (locative), and "boulder" (other). If we make these additions, our chart will look like this: matter & energy: living, vertebrates: mammals -> man, tiger, mouse, elf, unicorn birds -> hawk, ostrich, canary, penguin, duck reptiles -> lizard, frog, turtle, snake, newt fish -> trout, halibut, perch, lamprey, herring insects -> ant, crab, mosquito, grasshopper, fly other animals -> jellyfish, octopus, worm, clam, starfish plants: trees & shrubs -> tree, shrub, oak, blueberry, apple, pine other sperma- tophytes -> grass, flower, carrot, seaweed, lily other plants -> algae, bacteria, moss, fern non-living, natural -> tornado, geyser, storm, earthquake non-living, artificial -> airplane, computer, oven, robot matter: living -> hand, leaf, branch, liver, acorn, ear non-living, natural, substance -> water, sand, salt, ivory, urine, air locative -> cliff, river, island, mountain, bay, sky other -> boulder, fang, stalagmite, shell, hair non-living, artificial, substance -> plastic, benzene, whisky, sauce, glue locative -> wharf, building, kitchen, city, road other -> window, statue, ax, book, nail, button energy: living -> ghost, spirit, god, energy creature non-living -> heat, thunder, photon, noise, light time: -> winter, sunset, equinox, morning, midnight [Technically, the "insect" group includes the entire arthropoda phylum, the "reptile" group includes both the amphibia and reptilia classes, and the "fish" group includes not only pisces but all other vertebrate classes such as selachii (sharks), marsipobranchii (lampreys), etc.] 
Note that I use the word "locative" in the following sense: a locative noun represents an entity which typically is built in place or evolves naturally in a single location, which is extremely difficult (if not impossible) to move to a different location, which is relatively permanent, and which is typically considered a place where one can go to, remain at, or depart from.

By the way, I think that most readers will agree that the words for 'living', 'non-living', 'natural', and 'artificial' should be derived from basic verbs. I also recommend that the words for 'matter' and 'energy' be derived from basic verbs since these do not describe specific entities or substances, but instead indicate particular states of actual items and substances. Thus, for example, the word for 'matter' would be the noun version of the verb meaning 'to be material' (i.e. 'to consist of matter'). We can do the same for words that are too general for the existing classifiers, such as 'plant' and 'animal'.

3.2 NOUN DESIGN ALGORITHM & EXAMPLES

In order to illustrate the noun derivation system, I will use the following classifiers:

    matter & energy:
      living, animals                    -nembi-
        vertebrates, mammals             -mo-
                     birds               -su-
                     reptiles            -pusta-
                     fish                -sai-
        insects (all arthropods)         -zio-
      living, plants                     -kaya-
        trees & shrubs                   -po-
        other spermatophytes             -tonze-
      non-living, natural                -ji-
      non-living, artificial             -fiu-

    matter:
      living                             -vau-
      non-living, natural,    substance  -fa-
                              locative   -nai-
                              other      -le-
      non-living, artificial, substance  -niu-
                              locative   -te-
                              other      -ki-

    energy:
      living                             -dengi-
      non-living                         -pai-

    time:                                -be-

The classifiers "-nembi-" and "-kaya-" will be used for animals and plants, respectively, that have not been assigned more specific classifiers. In the derivations that follow, we will refer to these categories as "other animals" and "other plants".
Thus, the more specific categories are actually subsets of the super-categories "-nembi-" and "-kaya-", but the super-categories will be used only if a more specific category is not available. Now, let's design some words. We'll start with a word for 'water' and use the root morpheme "-gua-". Since 'water' is a natural, non-living substance, the classifier is "-fa-". Thus the word for water is "guafada". Now, what happens when we apply OTHER noun classifiers to the root "-gua-"? For example, what is the meaning of the 'matter & energy, living, mammal' form "guamoda"? Is there such a thing as a creature that is the mammalian equivalent of water? No - at least not if we wish to ensure semantic precision in our derivations. Furthermore, with a little experimenting, you'll quickly discover that it is essentially impossible to cross noun class boundaries with any degree of semantic precision - something that was very easy to do with verb classes. The reason for this is simple: nouns represent truly unique entities, and entities in different classes are seldom related in precisely definable ways. Thus, it is not practical to do with nouns what we did earlier with verbs. Instead, we have two choices: 1. Do not use classifiers with nouns at all. In effect, each noun root will belong to one and only one class, and will not be able to change class. 2. Compromise. Use roots for their MNEMONIC value, rather than for their literal meaning. In effect, the combination of classifier plus root will become a new, unique, de-facto root. Using the compromise approach, root morphemes can be combined with unrelated noun classifiers in a way that is semantically imprecise, INTENTIONALLY, but which is mnemonically useful. However, the classifier ITSELF will be semantically precise, and the combination of classifier PLUS root will also be semantically precise. 
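The word-building step just described (root, then noun classifier, then noun ending) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the dictionary keys are labels invented for this sketch, the noun ending "-da" is read off the example words such as "guafada", and the function names and the tiny lexicon are hypothetical.

```python
# Noun classifiers from the table above. The keys are illustrative labels
# chosen for this sketch; they are not part of the language itself.
NOUN_CLASSIFIERS = {
    "animal": "nembi",
    "mammal": "mo",
    "bird": "su",
    "reptile": "pusta",
    "fish": "sai",
    "insect": "zio",
    "plant": "kaya",
    "tree or shrub": "po",
    "other spermatophyte": "tonze",
    "matter & energy, natural": "ji",
    "matter & energy, artificial": "fiu",
    "living matter": "vau",
    "natural substance": "fa",
    "natural locative": "nai",
    "natural other": "le",
    "artificial substance": "niu",
    "artificial locative": "te",
    "artificial other": "ki",
    "living energy": "dengi",
    "non-living energy": "pai",
    "time": "be",
}

NOUN_ENDING = "da"  # assumption: inferred from examples such as "guafada"


def classifier_for(specific: str, general: str) -> str:
    """Use the specific classifier if one exists, else the super-category."""
    if specific in NOUN_CLASSIFIERS:
        return NOUN_CLASSIFIERS[specific]
    return NOUN_CLASSIFIERS[general]


def build_noun(root: str, classifier: str) -> str:
    """Concatenate root + noun classifier + noun ending."""
    return root + classifier + NOUN_ENDING


# Under the compromise approach, the (root, classifier) pair acts as a
# de facto root, so meanings are stored per pair rather than per root.
LEXICON = {("gua", "fa"): "water"}


def gloss(root: str, classifier: str):
    """Return the assigned meaning, or None if the pair is still unused."""
    return LEXICON.get((root, classifier))


print(build_noun("gua", NOUN_CLASSIFIERS["natural substance"]))  # guafada
print(build_noun("gua", classifier_for("mammal", "animal")))     # guamoda
print(classifier_for("mollusc", "animal"))                       # nembi
print(gloss("gua", "mo"))                                        # None
```

Keying the lexicon on the (root, classifier) pair mirrors the compromise approach: the pair carries the precise meaning, while the root alone serves only as a mnemonic.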
For example, the root morpheme "-gua-" with the basic sense of 'water' could be used to create the following nouns:

    matter & energy:
      living,
        mammals               -> guamoda     - cetacean
        birds                 -> guasuda     - duck
        fish                  -> guasaida    - puffer/swellfish/blowfish
        reptiles              -> guapustada  - water snake
        insects               -> guazioda    - crab
        other animals         -> guanembida  - jellyfish
        trees & shrubs        -> guapoda     - water willow
        other spermatophytes  -> guatonzeda  - kelp/seaweed
        other plants          -> guakayada   - algae
      non-living, natural     -> guajida     - hot spring
      non-living, artificial  -> guafiuda    - jacuzzi (i.e. hot tub)

    matter:
      living                  -> guavauda    - bladder (i.e. urinary)
      non-living, natural,
        substance             -> guafada     - water
        locative              -> guanaida    - lake
        other                 -> gualeda     - puddle
      non-living, artificial,
        substance             -> guaniuda    - broth/soup
        locative              -> guateda     - reservoir
        other                 -> guakida     - aquarium

    energy:
      living                  -> guadengida  - water spirit, undine
      non-living              -> guapaida    - hydropower, water power

    time:                     -> guabeda     - monsoon, wet/rainy season

Note that the above are just suggestions. With a little effort, it may be possible to come up with even better ones. It should even be possible to develop rough rules or guidelines that would make the results of the derivational process more predictable (even though the process can never be TOTALLY predictable). For example, one very useful technique is to always state a concept using an English compound or descriptive phrase that is appropriate to each class. Here are some examples:

    water plant                 -> seaweed
    water mammal                -> cetacean
    water bird                  -> duck
    water creature              -> jellyfish
    natural energetic water     -> hot spring
    artificial energetic water  -> hot tub
    natural water location      -> lake
    natural water thing         -> puddle
    artificial water location   -> reservoir
    artificial water thing      -> aquarium
    living water energy         -> undine
    non-living water energy     -> hydropower
    water time                  -> wet season

And so on. Another possible rule would be to always use the same root for a plant and its fruit or seed.
The plant would take a plant classifier, and the fruit would take the 'matter, living' classifier. For example, if the root for 'banana' is "-gelba-", then "gelbapoda" would mean 'banana tree' and "gelbavauda" would mean 'banana (the fruit)'.

The approach I'm suggesting here is appealing because it's not possible to use noun roots with unrelated classifiers in a way that is semantically precise. Instead, we must either create an extremely large set of root morphemes for nouns, or re-use the roots for their mnemonic value. I feel that the latter choice is far preferable. In effect, the combination of classifier-plus-root becomes a de facto NEW root, even though it has the morphology of classifier-plus-root. And when looked at in this way, there is really nothing imprecise about this approach. We only need to keep in mind that the root morpheme is just a mnemonic aid.

To me, it seems like a great way to re-use roots that would otherwise be underutilized. It will be especially useful when you first come across a new word, because the classifier is SEMANTICALLY PRECISE and provides a lot of information about the word. The mnemonic value of the root provides even more information, making it easier to guess the meaning of the word, and to remember its meaning in the future.

Also, there is nothing typologically unnatural about this scheme. In fact, this approach is somewhat akin to English compounds such as "whitefish", "highland", "seahorse", etc. However, the noun classifiers that we are using here are sometimes vaguer and more generic than English headwords such as "land" and "horse". Thus, what I am proposing is actually much closer to what is done in some of the Bantu languages of Africa or some of the aboriginal languages of Australia and New Guinea. The main difference, though, is that the classifiers in these languages are even VAGUER than the classifiers we are using here.
Thus, this approach fits in quite snugly between the opposite poles of classificational possibility. [The AL designer also has the option of creating classifiers that are even more specific than those above. For example, the 'mammal' class could be further sub-divided into 'primates', 'carnivores', 'marsupials', 'rodents', etc.] In summary, I am suggesting that we use semantic precision only when it is practical. We should re-use root morphemes as mnemonic aids when semantic precision is not practical. The alternative is to create many thousands of additional root morphemes which will have to be learned by the student and which will have little or no usefulness in the creation of additional words. 3.3 FROM BASIC NOUN TO OTHER PARTS OF SPEECH The simplest kind of derivation is to change the part-of-speech. For these, I suggest that the verb form have the meaning 'to be X', and that the other forms be interpreted in the usual way. Thus, for example, the word "guasusi" would be a P-s verb meaning 'to be duck'. The adjective form, "guasuno", could be used in expressions such as "duck chick", "duck egg", "duck colony", or "duck wing". The adverb form "guasupe" would have the meanings 'being duck', 'since it is (a) duck', 'since they are duck', etc. Note that this approach is perfectly consistent with the rules we adopted for basic verbs. Although the above derivation is useful, it will be much more productive to add verb classifiers to basic nouns. Before doing this, though, we need to define the semantics of the noun-to-verb conversion. To this end, I suggest that we first derive an illustrative 'prototype' verb form. The semantics of this prototype, which I will describe below, is intended to maximize the number of useful words that we can derive from each basic noun with semantic precision. Next, we can extract the basic state concept from the prototype and use it to derive all possible verb forms. 
Once these other verbs have been derived, we can then derive adjectives, adverbs, and case tags by applying exactly the same rules that we applied earlier to basic verbs.

After a lot of experimentation, I decided that the best prototype form would be an A/P-d verb with the following semantics:

    The agent causes the patient and the entity indicated by the basic noun to physically COME TOGETHER in a manner that is characteristic of both entities.

Here are some English examples:

              Prototype
    Noun      A/P-d verb    Semantics
    ----      ----------    ---------
    salt      to salt       to cause salt to come together with P
                            e.g. He salted the stew.
    seed      to seed       to cause seed to come together with P
                            e.g. He seeded the garden.
    bed       to 'bed'      to cause P to come together with bed(s)
                            e.g. She 'bedded' the children.
                               = She put the children to bed.
    tree      to 'tree'     to cause tree(s) to come together with P
                            e.g. They 'treed' the pasture.
                               = They planted trees in the pasture.
                              OR They brought trees to the pasture.
    storm     to 'storm'    to cause storm(s) to come together with P
                            e.g. Mother Nature 'stormed' us.
                               = Mother Nature hit us with a storm.
    brain     to 'brain'    to cause brain(s) to come together with P
                            e.g. The scientist 'brained' the android.
                               = The scientist gave the android a brain.
    oven      to 'oven'     to cause P to come together with oven(s)
                            e.g. He 'ovened' a cake.
                               = He baked a cake.
    land      to land       to cause P to come together with land
                            e.g. The pilot landed the airplane.
    glue      to glue       to cause glue to come together with P
                            e.g. He glued the envelope.
    boat      to 'boat'     to cause P to come together with boat(s)
                            e.g. The crew 'boated' the cargo.
                               = The crew loaded the cargo onto (the) boat(s).
    pencil    to pencil     to cause pencil(s) to come together with P
                            e.g. He penciled the sign with graffiti.
    wharf     to 'wharf'    to cause P to come together with wharf(s)
                            e.g. He 'wharfed' the rowboat.
                               = He moored the rowboat (to a wharf).
    ghost     to 'ghost'    to cause ghost(s) to come together with P
                            e.g. The sorcerer 'ghosted' the house.
                               = The sorcerer caused the house to become haunted.

Note that, if a noun N is physically larger or inherently less movable than the patient P, it makes more sense to state the semantics as 'to cause P to come together with N'. However, if the patient is physically larger or inherently less movable than the noun, it makes more sense to reverse the order; i.e. 'to cause N to come together with P'.

As you can see from the above derivations, the paraphrase 'to come together' is quite vague. In most cases, the semantics can be more precisely stated as one or more of the following:

    A causes P to undergo a change of state by USING N on P.
    A causes P to undergo a change of state by APPLYING N to P.
    A causes P to undergo a change of state by ADDING N to P.
    A causes P to undergo a change of state by GIVING N to P.

and for locative nouns:

    A causes P to undergo a change of state by MOVING/BRINGING P to N.

Here are the examples again with the "using" wording:

              Prototype
    Noun      A/P-d verb    Semantics
    ----      ----------    ---------
    salt      to salt       A changes state of P using salt
                            e.g. He salted the stew.
    seed      to seed       A changes state of P using seed(s)
                            e.g. He seeded the garden.
    bed       to 'bed'      A changes state of P using bed(s)
                            e.g. She put the children to bed.
    tree      to 'tree'     A changes state of P using tree(s)
                            e.g. They planted trees in the pasture.
    storm     to 'storm'    A changes state of P using storm(s)
                            e.g. Mother Nature hit us with a storm.
    brain     to 'brain'    A changes state of P using brain(s)
                            e.g. The scientist gave the android a brain.
    oven      to 'oven'     A changes state of P using oven(s)
                            e.g. He baked a cake.
    land      to land       A changes state of P using land
                            e.g. The pilot landed the airplane.
    glue      to glue       A changes state of P using glue
                            e.g. He glued the envelope.
    boat      to 'boat'     A changes state of P using boat(s)
                            e.g. The crew boarded the passengers.
    pencil    to pencil     A changes state of P using pencil(s)
                            e.g. He penciled the sign with graffiti.
    wharf     to 'wharf'    A changes state of P using wharf(wharves)
                            e.g. He moored the rowboat (to a wharf).
    ghost     to 'ghost'    A changes state of P using ghost(s)
                            e.g. The sorcerer caused the house to become haunted.

Thus, an alternative statement of the semantics would be 'A uses N to cause a change of state in P'. Similarly, we can use the other paraphrases to obtain the same results: 'A applies N to P', 'A adds N to P', 'A gives N to P', etc. Having several ways of stating the semantics of this construction will be very helpful, since it will allow us to clarify the meaning of a word if there is any ambiguity.

Note that many of the verbs can take an oblique focus argument, since they can indicate a change of relationship between the patient and some other entity. Here are some examples (where the focus is capitalized):

    We 'treed' three acres WITH ELM AND OAK.
    They landed the plane ON RUNWAY THREE.
    He penciled the sign WITH GRAFFITI.
    I glued the envelope TO THE BOX.

Considering our earlier discussions on serial verbs and case tags, it should be no surprise that the normally locative prepositions "ON" and "TO" actually introduce focus arguments. This usage will become even clearer when we derive several locative case tags later.

Note that the phrases using "WITH" seem to have an instrumental sense, but the sense of instrumentality is very weak. Here are some examples with REAL instruments:

    He penciled the sign WITH A PENCIL FROM HIS POCKET.
      or
    He penciled the sign with graffiti USING A PENCIL FROM HIS POCKET.

    He glued the envelope to the box WITH CARPENTERS' GLUE.

In the above examples, the capitalized phrases are not foci, but are true instrumentals.

The next step is to determine the state concept (i.e., the final state) associated with the A/P-d verb.
Here are the results for the previous examples:

              Prototype
    Noun      A/P-d verb    State concept
    ----      ----------    -------------
    salt      to salt       salty, having salt
    seed      to seed       having seed
    bed       to 'bed'      being in bed
    tree      to 'tree'     having trees
    storm     to 'storm'    experiencing a storm
    brain     to 'brain'    having a brain
    oven      to 'oven'     being baked or roasted
    land      to land       being on land
    glue      to glue       being glued
    boat      to 'boat'     being aboard a boat
    pencil    to pencil     having marks/writing made with a pencil
    wharf     to 'wharf'    being moored to a wharf
    ghost     to 'ghost'    being haunted, having ghosts

In the above examples, I have intentionally avoided the use of passive-only forms (such as "landed" or "salted") in the description of the state concepts, because passives have a strong implication of agency which should not be present in the state concepts themselves.

With these state concepts, we can now create all possible verb, adverb, and case tag forms just as we did earlier with basic verbs. Here are a few examples:

    salt  -> A/P-d verb 'to salt', A/P-s verb 'to keep salty',
             P-d verb 'to get salty', P-s verb 'to be salty',
             P-s adjective 'salty'

    bed   -> A/P-d verb 'to put to bed', AP-d verb 'to go to bed',
             P-s verb 'to be in bed', P-s adjective 'in bed'

    land  -> A/P-d verb 'to land (e.g. plane)',
             AP-d [+F] verb 'to alight (e.g. bird)',
             P-s adjective 'aground/on land'

    ghost -> P-d verb 'to become haunted', P-s verb 'to be haunted',
             P-s adjective 'haunted'

And so forth.

[Incidentally, in order to handle a sentence like "Seven evil ghosts haunted the old mansion", we need a reflexive voice alteration that changes the A/P/F-s verb to an AF/P-s verb. We'll discuss reflexives later.]

Now, let's do a few REAL derivations using some of the nouns we derived earlier from the root "-gua-".
First, though, we must add a new rule to our morphology that will allow us to convert a basic noun to a verb and to change its argument structure:

    A basic noun can be converted to a verb with a precise argument structure by simply adding the appropriate verb classifier after the noun classifier.

For example, to change the basic noun "guasaida", meaning 'puffer/swellfish', to an A/P-d verb (classifier "-pu-"), we insert "-pu-" after the noun classifier "-sai-". Thus, the result would be "guasaipusi".

Now, here are some real derivations:

    guasaida 'puffer' ->
        A/P-d          guasaipusi  'to put puffers into/to populate with puffers'
        P-s adjective  guasaiseno  'having or populated with puffers'

    guaniuda 'broth' ->
        A/P-d          guaniupusi  'to add broth to'
        P-s adjective  guaniuseno  'having broth/brothy'

    guapaida 'hydropower' ->
        A/P-d          guapaipusi  'to apply hydropower to'
        P-s adjective  guapaiseno  'water-powered'

    guafada 'water' ->
        A/P-d          guafapusi   'to water' (e.g. to water a lawn)
        P-d            guafapiasi  'to get wet'
        P-s adjective  guafaseno   'wet'

3.4 USING VERB CLASSIFIERS WITH BASIC NOUN ROOTS

In the previous section, we applied verb classifiers to COMPLETE nouns; i.e. nouns which contained a root PLUS a noun classifier. We should also be able to derive useful words by applying verb classifiers to a BARE noun root. The question then becomes: how do we interpret the result?

In my opinion, the only practical choice we have is to use the root for its mnemonic value, as we did above. In this case, however, we should try to extract a STATE or ACTION CONCEPT that closely relates to the meaning of the noun root. For example, the root "-gua-" = 'water' could be used to represent either of the closely related state concepts 'wet' or 'liquid'.
But since 'wet' has already been derived using the complete noun, we can use "-gua-" to represent 'liquid', a binary state with default class A/P-d:

    guasi=guapusi  -> A/P-d verb:     to liquify
    guasesi        -> P-s verb:       to be liquid
    guaseda        -> P-s noun:       liquid, fluid
    guaseno        -> P-s adjective:  liquid, fluid (= in a liquid state)
    guasepe        -> P-s adverb:     fluidly
    guapiasi       -> P-d verb:       to melt, to condense

Note that "guapiasi" can mean either 'condense' (initial state is gaseous) or 'melt' (initial state is solid), since the verb itself does not imply any initial state. In order to capture the meaning of the English words exactly, we would need to specify an initial state (i.e. solid or gaseous).

[We will discuss how to do this later. However, this specificity is almost never needed since we almost always know the initial state of the patient because of its inherent nature or from context (e.g. "The warm weather XXXed the snow" or "The steam slowly XXXed"). In fact, the only examples I can think of where such specificity might be needed are in chemistry, where chemical names are used which do not indicate a particular solid or gaseous state (e.g. "selenium" or "hydrogen chloride"). However, even in chemistry, the initial state is usually obvious from context.]

Here is a list of some basic nouns and (possible) state and action concepts:

    water  -> liquid
    man    -> speaking
    dog    -> barking
    sun    -> light/bright/aglow
    fire   -> hot
    snow   -> white
    table  -> supporting
    bed    -> sleeping
    wall   -> separating
    hand   -> handling/manipulating
    house  -> safe

And so on.

In effect, what we are doing here is giving two distinct but related meanings to each root. One meaning represents a basic noun (e.g. 'water') while the other meaning represents a basic verb (e.g. 'liquid'). Thus, the dictionary entry for a root in our sample language MAY have two components: one for use as a mnemonic with noun classifiers and the other for use as a precise state or action with verb classifiers.
However, as we will see in the next section, some basic verb roots are best used "as is" when applying noun classifiers.

3.5 USING NOUN CLASSIFIERS WITH BASIC VERB ROOTS

In the previous section, we used a basic noun root meaning 'water' to derive many other basic nouns. We also gave the same root the meaning 'liquid' for use with verb classifiers. How, though, should we deal with concepts that are inherently stative in nature? For example, would the root "teyo-" meaning 'knowing/knowledgeable' be very productive in basic noun derivations? Probably not. What, for example, would correspond to 'knowledgeable fish', or 'knowledgeable natural substance'? Still, a few derivations would be useful:

     matter:
          non-living, artificial, locative -> teyoteda  - school
                                  other    -> teyokida  - encyclopedia
          non-living, natural, locative    -> teyonaida - oracle
     matter & energy:
          non-living, artificial           -> teyofiuda - computer

This does not mean that the other derivations are useless. In fact, they could be quite useful in stories and novels, since writers often create and use new words in their stories. For example, in a fantasy or mythological setting, an especially knowledgeable species of bird could be called a "teyosuda", a 'cavern of knowledge' could be called a "teyonaida", and so on.

However, other stative concepts can definitely be useful. For example, the related concept of 'smart/intelligent' will be much more productive. In the sample language, we will use the root "tenci-" to represent the concept 'smart/intelligent'. Using this root, we can derive the useful P-s words:

     tencisi = tencisesi = to be smart/intelligent
     tencino = smart/intelligent
     tencipe = intelligently

We can also use other verb classifiers to derive words such as "tencipusi", meaning 'to smarten = to make more intelligent', "tencipiasi" meaning 'to smarten up = to become more intelligent', and so on. (Later, we'll see what happens when we add a focus to stative concepts such as 'intelligent'.)
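The verb derivations from a state root shown above ("tencisi", "tencino", "tencipusi", and so on) follow a simple pattern: root, then an optional verb classifier, then a part-of-speech ending. The following Python fragment is only a rough sketch of that pattern. The function name, the two lookup tables, and the "default_class" parameter are my own illustrative assumptions; the tables list only the morphemes that appear in the examples, not the full inventory of the sample language.

```python
# Part-of-speech endings as they appear in the examples above.
POS_SUFFIX = {"verb": "si", "noun": "da", "adjective": "no", "adverb": "pe"}

# Verb classifiers attested in the examples. This table is an illustrative
# assumption, not the complete classifier inventory of the sample language.
VERB_CLASSIFIER = {"A/P-d": "pu", "P-s": "se", "P-d": "pia"}


def derive(root, verb_class, pos, default_class=None):
    """Build a word from a root, a verb class, and a part of speech.

    When verb_class equals the root's default class, the classifier is
    omitted, which gives short forms such as "tencisi" for "tencisesi".
    """
    if verb_class == default_class:
        classifier = ""
    else:
        classifier = VERB_CLASSIFIER[verb_class]
    return root + classifier + POS_SUFFIX[pos]


# The state root "tenci-" ('smart/intelligent') has default class P-s.
short_form = derive("tenci", "P-s", "verb", default_class="P-s")
full_form = derive("tenci", "P-s", "verb")
causative = derive("tenci", "A/P-d", "verb", default_class="P-s")
inchoative = derive("tenci", "P-d", "verb", default_class="P-s")
```

Passing no default class always spells out the classifier, which is how the sketch produces the long form "tencisesi" alongside the short form "tencisi".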
For noun derivations, we can keep the state meaning and apply it directly to the noun classifier in a way that is cogent and useful. Here are my candidates for the state root "tenci-", meaning 'smart/intelligent':

     matter & energy:
          living, mammal       -> tencimoda    - chimpanzee
                  bird         -> tencisuda    - hawk
                  reptile      -> tencipustada - ???
                  fish         -> tencisaida   - ???
                  insect       -> tencizioda   - bee
                  other animal -> tencinembida - octopus
                  tree         -> tencipoda    - ???
          non-living, natural    -> tencijida  - ???
          non-living, artificial -> tencifiuda - robot/android
     matter:
          living -> tencivauda - brain
          non-living, natural, substance -> tencifada  - ???
                               locative  -> tencinaida - ???
                               other     -> tencileda  - ???
          non-living, artificial, substance -> tenciniuda - ???
                                  locative  -> tenciteda  - ???
                                  other     -> tencikida  - ???
     energy:
          living     -> tencidengida - mind/intellect
          non-living -> tencipaida   - intellectual power
     time: -> tencibeda - ???

As you can see, I did not provide meanings for all possible derivations. Perhaps it will be possible to do so later. I feel that when using a root for its mnemonic value, the result should be reasonably obvious, and I was not able to provide reasonably obvious meanings for some of the above derivations.

Also, when deciding on a meaning, make sure that it cannot be more accurately derived with a different root. For example, I actually considered giving the meaning 'paper' to the word "tenciniuda", but realized that it would be far preferable to derive the word for 'paper' from the same root used to derive the nouns meaning 'tree' and 'wood'.

Ultimately, in the final stages of vocabulary design, you will have a large number of words that don't seem to be derivable in any reasonably obvious way, and for which you will not want to provide unique roots because of their rarity. At that point, you can use some of the unused derivations. For example, let's say we need a word for the freshwater fish called the 'muskellunge' (relative of the pike and pickerel).
Since these fish are large and relatively intelligent compared to most fish, we could use the word "tencisaida" (which we did not use above). In this case it's doubtful that ANY other root would be much better for such a little-known fish. Thus, "tencisai-" would become a de facto root for 'muskellunge'. And on first seeing this word, we at least immediately know that it's some kind of fish because of the "-sai-" classifier. Finally, don't forget that the above nouns can undergo still further derivation by adding verb classifiers to them as we illustrated earlier. 3.6 ABSTRACT NOUNS There are several nouns that are difficult to classify because of their inherent abstractness. Some of these nouns refer to concepts such as language (e.g. French), culture (e.g. Arab), race (e.g. Caucasian), nationality (e.g. Mexican), and religion or ideology (e.g. Christian). These, however, are all proper nouns, and I will postpone discussion of them until later, in the section on proper nouns and vocatives. There are also concepts that are more general in nature and which typically describe the activities of humans (and others), the abstract products of such activities, the components of such products, and so on. The question, though, is: What are these words? Are they nouns? Are they verbs? Or are they something else? To answer this question, consider the English words "opera", "mathematics", and "adjective". If they are inherently verbs, then why do we never use them as verbs? They are always used as nouns. And if they are inherently stative, then why can we never use them as adjectives? In fact, if they were inherently stative, we would not need to derive such words as "operatic", "mathematical", and "adjectival". The only conclusion that makes any sense is that these words are inherently NOUNS. So, if they are indeed nouns, then how do we classify them? Consider the word "opera". We might be tempted to classify it as non-living, artificial matter & energy. 
However, this would put it into the same category as "jacuzzi", "computer", and "automobile". For some reason or other, my mind rejects the idea that "computer" and "opera" are in the same class. And what about "mathematics", "adjective", and "poem"? Should they be placed in the non-living energy class? If so, they would be classified along with "electricity", "light", and "thunder". Again, my mind rejects this categorization.

One thing that should be fairly obvious by now is that noun classification is inherently arbitrary, and that there is no way to avoid this arbitrariness. We can see logic and structure in the design of verbs, but nouns resist any truly logical classification. The reason for this is simply that nouns represent the products of an inherently random universe. For example, if you look at a diagram that classifies the animal kingdom, you'll find that some main branches have very few sub-branches, while others have numerous sub-branches with sub-sub-branches, and so on. You will also find that some entities resist accurate categorization into any single class. We can only expect that this inherent arbitrariness must be even more prevalent when dealing with more complex, abstract concepts, especially when we add concepts that represent the products of HUMAN activities.

Thus, it seems to me that our only recourse in dealing with these words is to create whatever classes are needed, in the same way as we did for the non-abstract noun classes. Fortunately, I don't think we'll need many classes to achieve our goal. In fact, we need very few. Here are the ones that I feel most efficiently divide up the concept space of abstract nouns:

     -ta-    Measurements: day, pound, meter, dollar, spoonful, basketful, lap, handful, acre, thimbleful, etc.

     -biu-   Groups/organizations: club, company, religion/sect, government, platoon, choir, etc.

     -xo-    Performances: opera, game, marathon, recital (poetry), sex, ceremony/ritual, lunch/meal/banquet, language/dialect, etc.
     -li-    Performance Components and Results: equation, sentence, climax, act/scene, hexagon, number, letter, cosine, crewcut, integral, poem, hairdo, cliche, square root, "straw man", momentum, etc.

     -tiwa-  Fields (of endeavor): history, music, physics, philosophy, linguistics, carpentry, farming, teaching, architecture, etc.

     -vo-    Field Components (i.e. schools) and Results (i.e. styles): functionalism, objectivism, monism, corinthian/doric, baroque, etc.

Many roots can be used in more than one abstract category. For example, the concept 'language' can be derived as a performance. When derived as a field of endeavor, it means 'linguistics'. When derived as a group, it means 'language community' or 'speakers of a language'. As another example, the concept 'corinthian' can be derived as a field result or style, in which case it could apply to specific architectural objects, such as corinthian columns. When derived as a field component or school, it would apply to the school of architecture that designed or studied such forms.

However, it may be simpler to teach beginning students of the language just the most common form, and use it as a qualifier where necessary. With this approach, the word "opera" would be derived as a performance, and the adjective form could then be used with the noun meaning 'field/profession' to create an expression meaning 'field of opera' or 'operatic profession'. More advanced users of the language can, of course, create multiple forms with the same root.

All of this implies that when a root is used for its mnemonic value with one of the above abstract categories, it should have the same mnemonic value when used with the other abstract categories. While this approach differs from the way most other basic noun categories are used, I think it is a good approach to adopt.

The only exception to the above approach would be the measurements class.
Measurements are really the components of performances, but I have given them a unique classifier because they form such a large and useful class. I recommend that the measurement classifier be attached to a root PLUS classifier for the non-precise measurements such as "spoonful" and "handful". It can be attached to a simple root for the precise measures, such as "gram" and "month".

Many fields of endeavor can be derived from complete verbs. For example, the word representing the profession of teaching can be derived from the complete A/P/F-d verb "teyokosi", meaning 'to teach', plus the field classifier. Thus, the word "teyokotiwada" means 'teaching profession' or 'field of teaching'.

Verb classifiers can be applied to abstract nouns in the same way as to other basic nouns. However, there will be one major difference. When basic nouns are used as roots in verb derivations, the roots are essentially states, not actions. For abstract nouns used as roots, the roots must be interpreted as actions, since performances, fields and so on inherently describe what an agent is doing - NOT what a patient is experiencing. Thus, the non-agentive derivations will not be very useful, and, as with all actions, the AP derivations will be activities. For example, if the root for 'mathematics' is "-mante-", then we can derive the following:

     mantetiwada      = 'mathematics' (basic abstract noun)
     mantetiwano      = 'mathematical'
     mantetiwape      = 'mathematically'
     mantetiwapanjisi = AP-s verb 'to do/use mathematics', 'to be a mathematician'
     mantetiwapanjida = AP-s noun 'mathematician'

Since many nouns will be needed to represent members of professions, a unique classifier, "-neya-", has been allocated for this purpose in the sample language. Thus, the classifier "-neya-" is equivalent to "-tiwa-panji-". Using "-neya-", the word for 'mathematician' is "manteneyada" and the word for 'to be a mathematician' is "manteneyasi".
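The equivalence between "-neya-" and "-tiwa-panji-" stated above amounts to a simple rewrite rule on morpheme sequences. Here is a minimal Python sketch of that rule, assuming that a word has already been split into its morphemes. The function names "expand", "contract", and "spell" are hypothetical and exist only for this illustration.

```python
# Morphemes taken from the examples above. The function names below are
# hypothetical helpers, introduced only for this illustration.
FIELD = "tiwa"         # field-of-endeavor classifier
AP_S = "panji"         # AP-s verb classifier
PROFESSIONAL = "neya"  # shorthand classifier equivalent to "-tiwa-panji-"


def expand(morphemes):
    """Replace each "-neya-" with the sequence "-tiwa-panji-"."""
    out = []
    for morpheme in morphemes:
        if morpheme == PROFESSIONAL:
            out.extend([FIELD, AP_S])
        else:
            out.append(morpheme)
    return out


def contract(morphemes):
    """Replace each adjacent "-tiwa-panji-" pair with "-neya-"."""
    out = []
    i = 0
    while i < len(morphemes):
        if morphemes[i:i + 2] == [FIELD, AP_S]:
            out.append(PROFESSIONAL)
            i += 2
        else:
            out.append(morphemes[i])
            i += 1
    return out


def spell(morphemes):
    """Join a morpheme sequence into a written word."""
    return "".join(morphemes)


long_noun = ["mante", "tiwa", "panji", "da"]   # 'mathematician', long form
short_noun = spell(contract(long_noun))
long_verb = spell(expand(["mante", "neya", "si"]))
```

Working on morpheme lists rather than on spelled words sidesteps the question of how to segment a written word, which the sketch does not attempt to answer.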
Note also that the root "-mante-" would be used with its absolute state meaning (which we have not defined here) in basic verb derivations, but is used only for its mnemonic value in the abstract noun derivations. Thus, abstract nouns use roots in exactly the same way as all other basic nouns. Because of this, our earlier derivation of "teyokotiwada" meaning 'teaching profession' is probably overkill. The derivation "teyotiwada" would be just as acceptable. However, it might be a good idea to include the verb classifier for more obscure fields and professions, or when the root could be used to represent more than one field depending on the verb classifier. 3.6.1 IS IT AN ACTION OR A PERFORMANCE? In some cases it may be difficult to decide whether a concept should be implemented as an action or as an abstract noun. For example, is translation from one language to another an action or a performance? It is certainly not a state concept, since it is not clear precisely what has undergone a change of state. But if we implement it as an action, then it will be most useful as an AP/F-s verb, where AP represents the translator and F represents the item being translated. However, translation is also a profession, and so can also be implemented as an abstract noun. And by doing so, we gain an important advantage: we don't need to allocate a unique root for the concept. Instead, we can use one or more existing morphemes for their mnemonic value. This is especially important since action concepts are not as productive as state concepts in further verbal derivation. But this raises questions about other action concepts. For example, should we implement concepts such as 'smoking', 'swimming', 'speaking', 'reading' and so on as actions or as abstract nouns? After giving the matter considerable thought, I feel that the following guidelines will be the most productive as well as the most natural: 1. Purely physical, non-speech activities should be implemented as actions; e.g. 
'kick', 'swim', 'smoke', 'dance', 'sing', etc. Note that these are what we have been calling basic "actions" or "activities".

2. Acts that clearly involve basic, uncomplicated, untrained, or unrehearsed human speech should be implemented as actions; e.g. 'speak', 'shout', 'curse', 'mock', etc. Note that these are what we have been calling basic "speech acts".

3. All other activities should be implemented as performances. These activities will consist of all human activities that are relatively complex, ritualistic, or artificial; e.g. 'reading', 'writing', 'translating', 'acting (movies or stage)', 'plumbing', 'woodworking', etc.

4. When in doubt, activities that are unique to humans or those which have associated professions should be implemented as performances. Others should be implemented as actions.

Unfortunately, the distinction that we must make between activities and performances is inherently arbitrary, because we must base our decisions on inherently subjective attributes such as naturalness and complexity. Thus, the above are guidelines - not hard and fast rules.

3.7 MASS, COUNT, AND GROUP DISTINCTIONS

Many nouns have separate forms that differentiate between homogeneous entities, individuals, and groups of individuals. These are referred to, respectively, as _mass nouns_, _count nouns_, and _group nouns_. Here are some English examples:

     Mass         Count             Group
     ----------   ---------------   --------------------------
     mutton       sheep             flock
     grass        blade of grass    lawn, patch of grass
                  ship              fleet
     bowels       intestine         lower digestive tract
                  leaf              bunch of leaves, foliage
     beef         steer             herd/cattle
     hair         strand of hair    patch of hair
     rice         grain of rice     measure or portion of rice
     guts/flesh   organ             body
     wood         tree              grove, wood
                  map               atlas
     water        drop of water     shower

Note that mass nouns are almost never used in the plural (*muttons, *beefs), while count and group nouns can often be either singular or plural (ship/ships, flock/flocks). Count nouns ALWAYS have both singular and plural versions.
The mass versions of words like "ship", "leaf", and "map" would indicate the substances or materials that the item is composed of.

[Incidentally, do not confuse group nouns discussed in this section with the abstract group class discussed in the previous section. Here, we are referring to natural groupings of any basic noun. The separate group class, however, refers to groups of diverse elements (typically human) linked by one or more activities specifically associated with the group. The groups discussed in this section do not imply any type of activity.]

In the noun derivation scheme discussed earlier, some classes contained only count nouns while others contained only mass nouns. Specifically, in the 'matter, non-living' classes, 'substances' are inherently mass nouns, while 'locatives' and 'others' are inherently count nouns. Note especially, though, that in our derivations, the 'other' counterparts of 'substance' nouns are NOT their count equivalents. For example, 'aquarium' is NOT the count equivalent of the mass noun 'broth'.

In other words, I am intentionally NOT using the substance/other classifiers to make a semantic mass/count distinction with the same root morpheme, even though such a distinction is an inherent part of the classification. There are two major reasons for this. First, we would end up wasting a basic noun classifier for a distinction that is almost always useless. (What is the count equivalent of 'broth'? What is the mass equivalent of 'aquarium'?) Second, as we saw above, we will also need to make a group distinction - and we cannot capture this new distinction in a way that is consistent with the mass/count distinction without providing additional basic noun classifiers.

Thus, it will be much better to create class-CHANGING morphemes that can specifically handle these distinctions when needed, rather than force the distinctions to be made all the time.
In other words, we need to create special class-changing morphemes to make mass/count/group distinctions in the same way that we created class-changing morphemes to make grammatical voice distinctions.

Before creating and applying such morphemes, however, we should first list all the noun classes and their default types; i.e., whether they are mass nouns, count nouns, or group nouns:

     matter & energy:
          living, mammal  -> count: man, tiger, mouse, elf, unicorn
                  bird    -> count: hawk, ostrich, canary, penguin
                  reptile -> count: lizard, newt, frog, turtle, snake
                  fish    -> count: trout, salmon, halibut, perch
                  insect  -> count: ant, fly, mosquito, grasshopper
                  others  -> count: tree, clam, vine, worm, carrot
          non-living, natural    -> count: tornado, geyser, storm, earthquake
          non-living, artificial -> count: airplane, computer, oven, robot
     matter:
          living -> count: hand, leaf, branch, liver, acorn
          non-living, natural, substance -> mass: water, sand, salt, ivory, urine, air
                               locative  -> count: cliff, river, island, mountain, bay
                               other     -> count: boulder, fang, strand of hair
          non-living, artificial, substance -> mass: plastic, benzene, whisky, sauce, glue
                                  locative  -> count: wharf, building, kitchen, city, road
                                  other     -> count: window, statue, ax, book, nail
     energy:
          living     -> count: ghost, spirit, god, banshee, undine
          non-living -> mass: heat, thunder, light, noise, energy
     time: -> either: winter, sunset, morning, season

Thus, the default for all 'substance' and 'non-living, energy' nouns will be 'mass'. The default for all 'time' nouns will depend on context. The default for all other nouns will be 'count'. NO class of nouns has a 'group' default.

There is also the question of the semantics of the class-changing morphemes. I suggest the following rules:

1. When converting a basic mass noun to a count noun, the result should indicate an entity that is both prototypical and useful in everyday human language.
If there is more than one possibility, choose the smallest unit that has common usage in natural language. Examples: water -> waterdrop, thunder -> thunderclap/peal of thunder, grass -> blade of grass, etc.

2. When converting a basic count noun to a mass noun, the result should indicate a reasonably homogeneous mass that is both prototypical and useful in everyday human language. Examples: all animals and plants -> the flesh or wood, summer -> summertime. If this interpretation is not practical, then it should indicate the substances of which the entity is composed. Example: map -> ink, paper, etc. that the map is made of; i.e. 'map stuff' or 'map material'

3. When converting any count noun to a group noun, the result should indicate a composite entity that is both prototypical and useful in everyday human language. If there is more than one possible grouping, choose the most common or most useful. Examples: mountain -> mountain range, mosquito -> swarm of mosquitos, X -> group or bunch of X's.

Here are the class-changing morphemes that we will use in the sample language along with a few examples:

     -gi-     convert to count noun
     -jazmi-  convert to mass noun
     -senje-  convert to group noun

     guafagida     = waterdrop
     guafasenjeda  = (water) shower
     guamosenjeda  = pod (of whales or dolphins)
     guasaisenjeda = school of puffers/blowfish
     guasujazmida  = duck flesh/meat

It is important to keep in mind that the mass/count/group distinctions should NOT be used to indicate exact magnitudes or counts, unless they are both natural and unique. For example, we could define the count version of the noun 'time' as 'day', but this just raises the question of how to handle the other time units, such as 'hour' and 'year'. Instead, I would define the count version of 'time' to be 'a while'; i.e. a non-exact unit of time that is both prototypical and useful in human language. In fact, units of measure should NEVER be derived as count versions of mass nouns.
Instead, they should be derived using the abstract measurement classifier that we discussed earlier. [I will provide examples of this later, in the chapter on counts and measures.]

Now, there is a potentially serious flaw with the above approach to mass/count/group distinctions. In the lexical semantic system being proposed here, classifiers and class-changing morphemes are SEMANTICALLY PRECISE. However, the above class-changing morphemes are somewhat arbitrary. For example, the mass or substance of a 'tree' is not just 'wood' - it also includes 'bark', 'leaf', 'root', 'fruit', etc.

If you do not insist on semantic precision for these distinctions, then the above system will probably be adequate for your needs. However, if this lack of precision is unacceptable to you (as it is to me), then I suggest the following solution:

(1) Use the class-changing morphemes for their precise (but vague) meanings 'unit of', 'mass of', and 'group of'. These CCMs can also be used when there is only one possible interpretation that is both common and useful.

(2) Use the basic noun classes to make useful but non-precise distinctions between mass and count senses of basic nouns.

Thus, we WILL use the count CCM to create words meaning 'waterdrop', 'snowflake', and 'thunderclap', since there are no other interpretations that are both common and useful.

Rule (2) will be most useful for non-precise mass derivations of basic count nouns. For example, if the state root "-denga-" means 'dirty' and the basic mammal noun "dengamoda" means 'pig', then we can derive the following:

     dengamoda      -> pig (mammal)
     dengafada      -> pork (non-living, natural substance)
     dengamojazmida -> pig flesh/skin/bones/etc. (mass noun)
     dengamosenjeda -> group/herd of pigs (group noun)

In other words, use the 'natural substance' classifier for mass derivations of basic count nouns that are inherently non-homogeneous.
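The mass/count/group derivations in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the class-changing morpheme is assumed to go directly before the noun ending "-da" (as it does in every example above), and "substance_form" simply attaches the natural-substance classifier "-fa-" to a bare root, as in "dengafada".

```python
# Class-changing morphemes (CCMs) as listed in this section.
CCM = {"count": "gi", "mass": "jazmi", "group": "senje"}
NOUN_ENDING = "da"


def change_class(noun, target):
    """Insert a class-changing morpheme before the final noun ending.

    Assumption: the CCM always sits directly before "-da", which matches
    every example given in the text.
    """
    if not noun.endswith(NOUN_ENDING):
        raise ValueError(f"{noun!r} does not end in -{NOUN_ENDING}")
    return noun[: -len(NOUN_ENDING)] + CCM[target] + NOUN_ENDING


def substance_form(root):
    """Rule (2): derive the non-precise mass sense with the natural-substance
    classifier "-fa-" instead of the mass CCM."""
    return root + "fa" + NOUN_ENDING


waterdrop = change_class("guafada", "count")
shower = change_class("guafada", "group")
pig_material = change_class("dengamoda", "mass")
pig_herd = change_class("dengamoda", "group")
pork = substance_form("denga")
```

The two functions mirror the two rules: "change_class" gives the precise but vague 'unit of', 'mass of', and 'group of' senses, while "substance_form" gives the everyday mass sense such as 'pork'.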
3.8 MORE ON ENVIRONMENTAL VERBS

The earlier noun-to-verb examples using the concept of 'storm' can also be extended to other environmental concepts such as 'rain', 'snow', etc. For example, if the basic noun for 'snow' (as in "Snow covered the ground") is "xumpifada", then we can derive the following words:

     xumpifada      -> snow (mass noun)
     xumpifagida    -> snowflake (count noun)
     xumpifasenjeda -> snowbank (group noun)
     xumpifaseno    -> snowy (P-s adjective)
     xumpifapiasi   -> it snowed on... (P-d verb)
     xumpifapiano   -> snowed-on, snow-covered (P-d adjective)
     xumpifapiadesi -> to snow (out) (middle voice 0-d [-P] verb)
     xumpijida      -> snowfall
     xumpibeda      -> winter

[Note that the verb "to snow" is dynamic. Even though the process is slow, it still causes a change in the environment.]

As discussed earlier, environmental verbs that involve simpler, more basic states (such as 'to get hot out') should be derived as basic verbs, rather than from basic 'energy, non-living' nouns. However, we will still need a noun to represent the concept of 'thermal energy'. For this, the obvious choice IS the 'energy, non-living' noun "xaupaida". Derivations using this basic noun will emphasize that thermal energy is present (P-s), is being applied (A/P-d), etc. The basic verb forms, though, are less physically or scientifically oriented, and are thus usable in more contexts. In effect, we are making a distinction between the mundane use of 'heat' and the more technical 'thermal energy'.

3.9 GENERIC NOUNS

As we did with verbs, we can create generic words using noun classifiers. Unlike verbs, however, there is no need to make a distinction between states and actions. Thus, a generic root is not necessary, and the classifier becomes, in effect, both a root and a classifier with the same meaning.
Here are some of the more useful examples:

     matter & energy:
          living, animal  -> nembida 'animal'
                  mammal  -> moda 'mammal'
                  bird    -> suda 'bird'
                  fish    -> saida 'fish'
                  reptile -> pustada 'reptile'
                  insect  -> zioda 'insect'
                  plant   -> kayada 'plant'
          non-living, artificial -> fiuda 'device', 'mechanism', 'appliance', 'apparatus'
     matter:
          living -> vauda 'organ', 'body part'
          non-living, natural, locative -> naida 'natural location/place/spot'
                               other    -> leda '(natural) thing', 'object'
          non-living, artificial, locative -> teda 'man-made location/place/spot'
                                  other    -> kida '(artificial) thing', 'item', 'implement'
     energy:
          living     -> dengida 'spirit/ghost'
          non-living -> paida 'energy'
     time: -> beda 'time'
     abstract nouns:
          -> tada 'unit of measure'
          -> xoda 'performance/activity'
          -> tiwada 'field/profession'

Finally, all of the derivations that apply to basic nouns can also be applied to generic nouns. For example, we can create the A/P-d verb "paipusi", meaning 'to energize/activate/turn on', the P-s adjective "zioseno", meaning 'infested with insects', and so on.

3.10 ADDITIONAL NOUN CLASSES

The AL designer may want to create additional classifiers to divide the semantic space of nouns into even smaller and more precise pieces. For example, additional classifiers could be created for the names of lesser-known species that are not covered above, instead of using the 'other' classifier. Thus, you could have additional classifiers for molluscs (e.g. snails), annelids (e.g. worms), echinoderms (e.g. starfish), etc. In this way, only truly obscure species would use the 'other' classifier. In fact, you might even want to consider completely eliminating the 'other' classifier, and replacing it with as many classifiers as are needed to completely represent the animal kingdom. While this may increase the learning burden on the student, it is still FAR better than having to memorize unique, unrelated root morphemes for each entity.
Similarly, you may want to replace the 'plant' classifiers with several smaller classes.

4.0 CASE TAGS

So far, we've discussed the major case roles of agent, patient, agent-patient, and focus, and mentioned in passing a few oblique roles, such as instrument and manner. We also spent a considerable amount of time showing how to convert verbs to case tags and adverbs. In this section, I'd like to discuss how to create oblique case tags for ANY case role, especially the more traditional ones.

4.1 REVIEW OF CASE ROLE SEMANTICS

A sentence consists of a main verb and its arguments, and each argument has a case relation associated with it. For example, a sentence like:

     On Tuesday, John moved the crate to the storeroom with a forklift.

can be analyzed as follows:

     move:  agent       -> John
            patient     -> the crate
            destination -> the storeroom
            instrument  -> a forklift
            time        -> Tuesday

In effect, the prepositions "on", "to" and "with" are LABELS; i.e., they name or 'tag' the roles played by their arguments. In English, the core roles of agent and patient are not explicitly labelled, but are indicated by the meaning of the verb and the relative positions of subject and object.

Also, keep in mind that an oblique case role not only modifies the entire event headed by the verb, but it also often has a strong link to a particular core argument of the main verb. Thus, the instrumental case tag can be derived from an A/P-s verb meaning 'to use', and links the agent of the main verb to the patient of the case tag. This is so because the subject of the case tag is A and the object of the case tag is P. In effect, the subject of the case tag is the agent of the main verb. If a case tag is derived from a P/F verb, then it will link its F argument to the patient of the main verb. Adverbs are derived from intransitive verbs and link to the appropriate argument of the main verb, but do not themselves have arguments.
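The role analysis of the forklift sentence earlier in this section can be written down as a small data structure. The Python sketch below is only an illustration: the function name "analyze" and the "LABELS" table are my own assumptions, and the table covers just the three prepositions of this one sentence. English prepositions are ambiguous in general (e.g. "with" is not always instrumental), so this is not a general parser.

```python
# Preposition-to-role labels for this one sentence only. English prepositions
# are ambiguous in general, so this table is an illustrative assumption.
LABELS = {"on": "time", "to": "destination", "with": "instrument"}


def analyze(verb, subject, obj, obliques):
    """Return a role frame for a simple active sentence.

    Core roles come from position (subject = agent, object = patient);
    oblique roles come from the preposition that labels each argument.
    """
    frame = {"verb": verb, "agent": subject, "patient": obj}
    for preposition, argument in obliques:
        frame[LABELS[preposition]] = argument
    return frame


# "On Tuesday, John moved the crate to the storeroom with a forklift."
frame = analyze(
    "move",
    "John",
    "the crate",
    [("on", "Tuesday"), ("to", "the storeroom"), ("with", "a forklift")],
)
```

The split inside "analyze" reflects the point made above: the core roles are unlabelled and come from position, while each oblique role is named by its tag.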
To create case tags, then, all we need to do is start with a verb that has the appropriate functional meaning and mark it in some way to show that it is a case tag. This, of course, is exactly what we did earlier when we converted verbs to case tags and adverbs. However, we did it with a little twist: when we convert a verb to a case tag, the 'label' sense of the case tag derives from the OBJECT of the case tag - NOT from the subject. In effect, for the 'label' sense, we actually used the noun derivation of the INVERSE of the verb. Since this may not be immediately obvious, consider our derivation of the instrumental sense of 'with':

     I broke the window with a hammer.
     I broke the window using a hammer.

     to use: A/P-s

Whenever we convert a basic verb such as "to use" to a noun, we give it the meaning of a generic SUBJECT. If we were to use THIS noun sense for a case tag, it would give us the meaning of 'user' - NOT the item used. Thus, the 'label' sense of "instrument" actually comes from the generic OBJECT - what we might call the 'usee'. If, instead, we performed an inverse voice change on the verb and converted the result to a case tag, we would then get the 'label' sense of 'user' (which for this verb is probably not very useful).

This can cause problems, however, since the case role of a generic object can be the same for different verbs derived from the same root. For example, the AP/F version of a verb differs from its P/F counterpart ONLY IN THE SEMANTICS OF THE SUBJECT - NOT IN THE SEMANTICS OF THE OBJECT. A case tag, however, must capture the semantics of the object. Consider the following examples:

     I watched as the soldiers surrounded the compound.
          'to surround' = AP/F-d

     I noticed that the soldiers surrounded the compound.
          'to surround' = AP/F-s

     I could see that the fence surrounded the compound.
'to surround' = P/F-s Obviously, we would like to create the equivalent of the English preposition "around" from the verb "to surround" (as in "They built a fence around the compound"). But which version of the verb do we use? The case role of the object is the same for all three verbs. They differ in the case role of the subject and in whether they are static or dynamic. The static/dynamic distinction is an important one, and although it does not appear in the English preposition "around", it does appear in other prepositions; e.g. "They jogged IN the park" versus "They jogged INTO the park". And since it is semantically valid, I feel that we SHOULD make this distinction, even though the English equivalent of this particular case tag is ambiguous. [We will discuss this distinction in more detail later.] This still leaves us with the problem of deciding whether to use the AP/F or the P/F verb or both. First of all, we don't need both forms for the simple reason that the main verb makes the necessary distinction, and there is no need to repeat this distinction in the case tag. The only real function of the case tag is to indicate the state of the patient while linking it to the main verb. How it got to that state is not important. Thus, only the P/F form is needed. Of course, if the patient of the main verb is also the agent, then it, by default, becomes the agent of the state indicated by the case tag. Thus, there is no need to indicate agency in this particular case tag, and doing so is simply redundant. This does not mean that a speaker should never use the AP/F version of 'around'. There will probably be some situations in which such forms could be used to indicate emphasis, subtlety, or precision. 4.1.1 NON-LINKING ADVERBS AND CASE TAGS Some readers may object to creating case tags that may be more semantically precise than their counterparts in natural languages. 
For example, some AL designers might want to have a preposition that has both the static and dynamic coverage of the English preposition 'around'. One possible way to accomplish this would be to have three special classifiers which would be intentionally vague, and which could be used instead of the more precise classifiers. In our sample language, these three classifiers will have the following specifications and interpretations: 0/P - Non-linking classifier "-gu-" The argument which follows this case tag is an argument of the verb and is somehow affected or potentially affected by the event, but there is no indication of who the agent is, or even if an agent exists. There is also no indication of whether the argument is affected statically, dynamically, or only potentially. 0/F - Non-linking classifier "-jo-" The argument which follows this case tag is an argument of the verb and is (perhaps) the focus of a relationship with one or more of the other arguments of the verb. There is no indication of whether the relationship is static, dynamic, or potential. "0" - Non-linking classifier "-la-" This classifier creates an adverb which modifies the verb and which is not explicitly linked to any of the other arguments of the verb. There is no indication of whether the adverb is static, dynamic, or potential. Note that distinct classifiers must be created, since all existing verb classifiers clearly indicate whether they are static, dynamic, or potential. Thus, these new classifiers are intentionally vague. The last form (i.e. "0") could be used to create equivalents of many English adverbs that end in "-ly". For some of these adverbs, it is often unclear which core argument is being linked to, if any. Consider the following examples: (1) John quickly opened the door. (2) John opened the door quickly. These two examples are almost, but not quite, synonymous. 
The first COULD imply (but not necessarily) that the agent was 'quick' in opening the door; i.e., he acted quickly. The second COULD be emphasizing (but not necessarily) that the door underwent rapid motion. Thus, if we wanted to be specific, we could obtain the first sense by deriving the adverb from the AP-s verb, since the agent caused himself to be 'quick'. The second sense can be obtained from the P-s verb, since it would emphasize that the door experienced the 'quick' state. If the speaker did not wish to be so precise, he could use the "0" form.

Using the same reasoning, the 0/F classifier would be used to create the equivalent of the English preposition 'around' using the same root as the various versions of the verb 'to surround'. Later, we'll see an example of how to use the 0/P classifier when we discuss the _beneficiary_ case tag.

Thus, by using specific forms (AP-s and P-s), we can modify the verb while indicating a link to a particular argument of the verb. By using the "0" form, we directly modify the verb. However, if there is a link to any argument of the verb (and there may NOT be one), it is not stated, although it may be implied by the context. In effect, the "0" form modifies the verb directly with no indication of linkage to other arguments of the verb.

Unfortunately, this implies that our semantics has a hole in it, since we've only covered two of three possibilities:

    1. Modify the verb and link to a specific argument.
    2. Modify the verb directly with no specified linkage.
    3. Modify the entire event - NOT just the verb.

We've covered the first two possibilities, but not the third. So, how do we modify the entire event while explicitly excluding a link to any argument of the main verb? In effect, how do we describe the state of an entire event? The answer is to describe the state of an event in the same way we describe the state of an entity - by making the event the argument of a P-s verb.
Here are examples that illustrate the idea: P-s adverb: John opened the door QUICKLY. - the door underwent rapid movement as John opened it. AP-s adverb: John QUICKLY opened the door. - John acted quickly in opening the door. "0" adverb: John opened the door QUICKLY. - no specific linkage. P-s verb: (John opened the door) BE_QUICK. - the entire event happened quickly. The actual surface form of the fourth example will depend on the syntax of your AL. In effect, we are saying that the entire event portrayed by the embedded sentence experienced a 'quick' state. In other words, the embedded sentence is the subject of the P-s verb "to be quick". The major difference between using the "0" adverb and the P-s verb is in how we perceive the event. With the "0" adverb, we are observing the event from the inside by directly modifying the verb. With the P-s verb, we are observing the event from the outside. This distinction is not a very useful one in the above example, but will be useful in other situations, as we will see later when we discuss _disjuncts_. Finally, having said all of the above, I do NOT feel that use of the "0" adverb or the "0/F" case tag is a good, general-purpose solution, and I would NOT use these unlinked classifiers in a design of my own unless there is clearly no other choice (we will see examples of this shortly). Just because English has some vague prepositions does not mean that your AL should also have them. In my opinion, semantically distinct meanings should have lexically distinct representations. 4.1.2 CASE ROLE TERMINOLOGY So far, we have limited ourselves to using the descriptive terms "core" and "oblique" when referring to case roles. We also mentioned in passing the terms "primary" and "secondary" when we discussed exchange verbs. At this point, I would like to take the opportunity to review what these terms mean, since a good understanding of the distinctions between them will be useful in the upcoming discussions. 
When referring to case roles, the term "core" refers to roles that are part of the valency of a verb. Thus, they usually refer to the four major roles: agent, agent-patient, patient, and focus. An "oblique" case role is not part of the valency of the verb, and must be marked in some way to indicate its function. In English, oblique case roles are introduced by prepositions. However, a core argument can be made oblique by means of a grammatical voice change, such as passive or anti-passive. A "primary" case role is a role that occurs naturally in the valency of an UNCHANGED verb. Thus, a primary case role must ALWAYS be either an agent, agent-patient, patient, or focus. Furthermore, an argument remains primary even if it is made oblique by means of a grammatical voice change. For example, the agent of the verb "kill" is a primary case role, whether it appears as the subject of the verb, or as the argument of the preposition "by" in a passive voice operation. A "secondary" case role is a role that occurs naturally as an oblique argument of an UNCHANGED verb. Thus, a secondary case role can NEVER be the primary agent, agent-patient, patient, or focus of the verb. 4.1.3 CASE ROLE PHILOSOPHY Starting with the next section of this monograph, I will discuss in detail how to derive many of the traditional case tags that appear in natural languages. However, before doing so, I would like to briefly digress and comment on the philosophy of case tag design. The derivational approach that we are using here is especially advantageous because there is no need to design a linguistically complete and correct case system. Linguists have yet to agree on such a system, and I'm not sure it's even possible. However, using the approach described here, ANY verb can be converted to a case tag, as long as the result makes sense and performs the desired function. [Incidentally, it is also possible that the system I am proposing here has real theoretical validity. 
In other words, it's possible that there really ARE only four basic case roles, and that all of the other case relations are derivable from them. However, I am not making this claim for the simple reason that I don't know if it is true, although I suspect it is. Also, I am not a professional linguist and I do not feel qualified to make such a claim.] The derivational approach has the added benefit of providing case tags with literally precise meanings. For example, the word "teyokomiusi" is the A/P-d [+F] version of the word meaning 'to teach'. If we invert it and convert it to the case tag "teyokomiuvipe" (the literal meaning would be something like 'the teacher being'), we can use it in the following sentence instead of the preposition "under": Bill studied biology UNDER Professor Jones. By using a case tag derived from the verb meaning 'teach', we specify the true case role precisely, rather than having to use "under" metaphorically. This is highly beneficial because metaphoric use of prepositions differs widely among natural languages, and if you translate a metaphoric use of an English preposition literally into your AL, you will be misunderstood by many people who do not speak English. With the rich and flexible case system that we are describing here, metaphorical use of case tags can be completely avoided. 4.2 PRIMARY CASE ROLES Let's start by creating oblique case tags for the four primary case roles. These can be used to specify oblique A, AP, P, and F arguments in passive, anti-passive, and other grammatical voice-changing constructions. As we discussed earlier, passive and anti-passive constructions remove an argument from the argument structure of a verb and make it optional. To specify the optional argument, we could use case tags derived specifically for agent, patient, agent-patient, and focus. However, most (and perhaps all) natural languages are not so semantically precise. 
For example, ANY passive construction in English allows the original subject to be specified obliquely using the preposition "by", regardless of the actual case role:

    The window was broken by the neighbors' son.
        - where "by" introduces an agent.
    The poem was memorized by all the children.
        - where "by" introduces an agent-patient.
    The thief was seen by an off-duty policeman.
        - where "by" introduces a patient.

The best approach, in my opinion, is to apply the voice-changing morpheme directly to the part-of-speech marker. In effect, the part-of-speech marker is a combination of a generic root PLUS a generic class. We'll see how to put this to even greater use later. Thus, using this approach, the oblique case tags would be:

    passive:       nupe   For oblique expression of original subject.
                          The English equivalent is "by".
    anti-passive:  gape   For oblique expression of original object.
                          English does not have a formal anti-passive,
                          so there is no standard English equivalent.

where "-nu-" and "-ga-" are the morphemes we defined earlier to perform passive and anti-passive voice changes, respectively. Similar derivations can be done for other grammatical voice changes which allow a demoted argument to be expressed obliquely.

[Note that by appending a class-changing morpheme directly to a terminator, we are creating a new root with the same form and meaning as the class-changing morpheme. Thus, the morpheme is, in fact, both a root and a class-changing morpheme with the same meaning.]

Now, since these roots/class-changing morphemes can appear both on the verb as well as in the generic case tag, this approach is somewhat redundant. To eliminate this redundancy, I suggest that the morpheme NOT be used on the verb when the demoted argument is expressed obliquely. Here are some English examples using the passive:

    Bill was closing the door.
    The door was closing by Bill.
        = The door was being closed by Bill.

    Louise punched Bill in the stomach.
    Bill punched in the stomach by Louise.
= Bill was punched in the stomach by Louise. Mike saw the accident. The accident saw by Mike. = The accident was seen by Mike. The cat killed and ate the mouse. The mouse killed and ate by the cat. = The mouse was killed and eaten by the cat. The boys broke three windows. Three windows broke by the boys. = Three windows were broken by the boys. The results sound odd in English, but are perfectly understandable. Keep in mind, though, that the class-changing morphemes MUST be used on the verb if the original subject (of the passive) or the original object (of the anti-passive) is not expressed obliquely. For double operations, such as double passive or double anti-passive, the double operation is just a short-cut for two more basic operations. For example, the double passive is equivalent to an anti-passive followed by a passive voice change. Thus, if one or both arguments need to be expressed obliquely, a case tag formed from the double morpheme should NOT be used, since it would not be clear which argument it referred to. Instead, case tags formed from the more basic operations should be used. 4.3 SECONDARY CASE ROLES Now, if we really need to express the roles of agent, patient, etc. PRECISELY, we can start with generic versions of the A/P-s, P/F-s and AP/F-s and invert them if necessary. When converted to case tags, these verbs will take on the 'label' meanings of 'semantic agent', 'semantic patient', etc. However, these derivations are TOO precise, since they precisely state whether they are static, dynamic, or potential. Thus, we now face the same problem we discussed earlier when we had to deal with case tags that were too precise. For a situation like this, I feel that the alternative approach discussed earlier (and to which I am generally opposed) can be legitimately used here. That solution, however, only provided us with three non-linking classifiers - "-gu-" for 0/P, "-jo-" for 0/F, and "-la-" for 0. 
We now need two more: 0/A - Non-linking classifier "-fia-" The argument which follows this case tag is an argument of the verb and is somehow responsible for the event. There is no indication of whether the patient is affected statically, dynamically, or only potentially. 0/AP - Non-linking classifier "-piu-" The argument which follows this case tag is an argument of the verb and is somehow both responsible for the event and affected by the event. There is no indication of whether the effect on the patient is static, dynamic, or potential. Thus, the new case tags are: Agent -> fiape Agent-patient -> piupe Patient -> gupe Focus -> jope As case tags, we could paraphrase them as "an agent being", "a patient being", etc. They can be useful when the speaker wants to intentionally add a case role where the verb does not normally allow one, as in "The three men died at the hands of the butler", where "at the hands of" would be represented by the word "fiape". And, as we will see later, they may also be used to add a new argument with the same case role as an existing argument, but whose linkage is not clear. It is important to emphasize that these case roles do NOT represent the same roles as the corresponding core arguments. Thus, they are secondary case roles. Later, we'll see how this distinction can be very useful. By now, I assume that the semantics of case roles is reasonably clear, and that creating case tags for ANY role should not be too difficult. A little practice, however, never hurts. So, in this section, I will describe how to create case tags for some of the most common, traditional case roles. In most of the following derivations, I will paraphrase the function of the case role with a standard template that will allow us to clearly and consistently capture the semantics of the case role. The template will have the form: "In the event in which X occurred, sub-event Y occurred". Here are some examples: He broke the window with a hammer. 
= In the event in which he broke the window, he used a hammer. He ran into the house. = In the event in which he ran, he 'became in' the house. He drove the car like a madman. = In the event in which he drove the car, he acted/behaved like a madman. I bought the car after we got married. = In the event in which I bought the car, the 'time locus' was after we got married. And so on. By using a standard template, we can avoid ad hoc solutions that will just have to be redone later. Finally, because the approach used here is derivational, we will often run into situations where several morphemes are needed to accurately express a particular case role. The result is often a word that is much longer than its counterpart in most natural languages. For me, this is not a problem. In fact, I feel that it adds considerably to the overall attractiveness of a language. However, for those who prefer more 'efficient' results, I will discuss how to considerably shorten words later, in the section on _macros_. 4.3.1 INSTRUMENT We've already mentioned the _instrument_ case role a few times in passing. Here, we'll discuss it in more detail. The instrument case role describes an entity which is used by an agent as an aid in accomplishing the event described by the verb. An instrument is not responsible for the event and is not significantly affected by the event. Thus, although the label "instrument" is universally used by linguists, a more proper term would probably be "catalyst". However, I will continue to use the traditional term "instrument". Here are some examples: John broke the window WITH a hammer. She cut the rope WITH scissors. The bear crushed the can WITH its powerful jaw muscles. He called me ON his new cellular phone. They heard the news ON channel 4. We paid for the items IN Japanese Yen. Here is an example of a standard paraphrase of this case role: He broke the window with a hammer. = In the event in which he broke the window, he used a hammer. 
Therefore, we can create the instrument case tag from the generic A/P-s action verb "zesi = zezoyasi" which we derived earlier and which means 'to use'. Thus, the instrument case tag is simply "zepe = zezoyape". 4.3.2 SECONDARY PATIENT, BENEFICIARY, AND MALEFICIARY A _beneficiary_ is the entity which may be INDIRECTLY affected by an event. Here are some examples: John washed the dishes FOR his wife. They took up a collection FOR John's widow. Bill bought some flowers FOR his girlfriend. She built the doghouse FOR the new puppy. He cooked supper FOR the children. A close paraphrase for the English preposition "for" in the above examples would be something like 'on behalf of', 'for the sake of', or 'in the interest of'. A more comprehensive and accurate paraphrase would be 'to attempt to have an unspecified or generic positive effect on'. The concept of an 'attempt' makes this last paraphrase more accurate because there is no indication that the beneficiary actually experiences a change of state - only that an attempt is made. The label "beneficiary", however, is something of a misnomer, as can be seen in the following examples: He set the trap FOR the raccoon. I bought the itching powder FOR my roommate. The first example is sometimes called a _maleficiary_, since the intended effect is clearly detrimental. The second example is ambiguous, since it is not clear whether the itching powder was purchased to be used BY the roommate or ON the roommate. Thus, a more appropriate name for this case role is _secondary patient_, since it is not always obvious if the intended effect is good or bad. Here is an example of a standard paraphrase of this case role: He cooked supper for his wife. = In the event in which he cooked supper, he was attempting to have an unspecified effect on his wife. Thus, the semantics of the secondary patient are simple: the agent of the main verb is responsible for the main event while attempting to have an unspecified effect on a secondary patient. 
The context determines whether the intended effect is positive or negative. And the outcome is uncertain. Thus, in the sample language we are using here, the case tag is simply the 0/P derivation "gupe" which we derived earlier. Note that "gupe" can also be used where there is no agent, as in the following examples: That fish tasted lousy TO me. Sometimes, he's very nasty TO her. The play was boring TO/FOR me but not TO/FOR Bill. The trip was wonderful FOR all of us. The box was too heavy FOR him. Finally, if we need to specifically indicate that a good or bad effect is being attempted (i.e. 'on behalf of' versus 'against'), we can use the 0/P versions of the verbs 'to benefit' or 'to harm', respectively. 4.3.3 COMITATIVE (also called ACCOMPANIMENT or ASSOCIATIVE) The _comitative_ case role introduces additional participants in an event which are equal in function and importance to the SUBJECT of the verb. The English prepositions "with" or "along with" are normally used to mark the comitative case role. Here are some examples: He weeded the garden WITH his wife. They went to Boston WITH the children. She died in a plane crash ALONG WITH three other passengers. I ate supper WITH my family. [Do not confuse this usage with the instrumental sense of the word "with", as in "I ate supper WITH a fork", or with the manner sense as in "I washed the crystal WITH care". Natural languages have a bad habit of overloading their case tags. To add to the confusion, they rarely do it in the same way.] This case role is an unusual one, because it is actually an alternative to coordination, which is normally handled syntactically. Thus, the first example can be paraphrased as "He and his wife weeded the garden". Some readers may argue, however, that use of the case tag "with" implies a certain degree of subordination which is not implied as strongly when using coordination. 
This apparent subordination, however, is a pragmatic effect - not a semantic one - and is implied by the context. In different contexts, the subordinating effect can be reversed: Billy went to the movies with his parents. (Billy accompanied his parents, and he was somehow subordinate or less in-control than his parents.) The Simpsons travelled to Boston with the children. (The children accompanied their parents and were somehow subordinate or less in-control than their parents.) Thus, the implication of subordination, if any, can work both ways. Note, though, that the comitative argument is certainly less TOPICAL than the subject, which is to be expected since it is oblique. In other words, the comitative case tag introduces an argument that performs exactly the same semantic role as the subject. The only difference is that the argument is reduced in topicality compared to the subject. Consider the following: 1. Dad went to a movie WITH the kids. 2. The kids went to a movie WITH dad. In (1), "dad" is more topical than "the kids", while in (2) "the kids" is more topical than "dad". In both sentences, though, "dad" and "the kids" play EXACTLY the same semantic role. Now, if "Dad" and "the kids" were equally topical, we would instead say something like this: 3. Dad AND the kids went to a movie. In other words, a coordinating conjunction does not imply any significant difference in topicality. However, we can reduce the topicality of a PART of the subject by using the comitative. This should certainly sound familiar, since reducing the topicality of an argument is exactly what a grammatical voice change does. The only difference is that the comitative reduces the topicality of only a PART of the subject. In spite of this, it is still a grammatical voice change. Now that we've discussed the semantics of the comitative case role, let's see how we can implement it. 
At first sight, it seems that we have at least five options: Option 1: We can create A/P, AP/F, and P/F verbs with the general meaning 'to do with/be with/accompany' and derive case tags from them. The problem with this approach is that it is far too precise, since the case tags imply strong links to specific arguments of the main verb, and they precisely state whether they are static or dynamic. Thus, using this approach, we would have to create several "-s" and "-d" versions, even though natural language case markers are rarely, if ever, so precise. Option 2: We can create a 0/F verb using the same root as is used in the P/F-s verb 'to be with'. The case tag version of this verb would have the same range of role coverage as the comitative function of most natural languages, including English. However, it fails to capture the semantics correctly. Consider the following: He weeded the garden with his wife. = He weeded the garden 'being with' his wife. The 0/F case tag simply states that his wife was present - it does NOT indicate that she also did some of the weeding. Note that this objection also applies to option 1. [Incidentally, this same root can also be used to create words such as the AP/F-s verb 'to accompany', the reflexive AF/P-s verb 'to bring', the AP/F-d verb 'to join', the AP-d [+F] verb 'to join up', the A/F-d [+P] verb 'to incorporate/admit', the A/F-s [+P] verb 'to include', the P/F-s verb 'to be with', the P-s [-F] adverb 'along', etc. And, as we'll see later, it can also be used to create the reciprocal verb 'to assemble/gather'.] Option 3: We can insist that coordination be used instead of a case tag. Thus, the language would not allow a sentence like "He weeded the garden with his wife". Instead, it would have to be stated as "He and his wife weeded the garden". We could also create a conjunction that intentionally implies a certain degree of subordination, as in "He and-to-a-lesser-degree his wife weeded the garden". 
However, a conjunction does not reduce the topicality of an argument.

Option 4: We could use the secondary agent, agent-patient, patient, and focus case tags which we derived earlier; i.e. "fiape", "piupe", "gupe", and "jope", respectively. However, this solution is not correct because these are SECONDARY case roles, and the roles they indicate may not be the same as the primary case roles. For example, as we saw when we discussed the beneficiary case role, a secondary patient may be somehow affected by the event, but not in the same way as the primary patient.

Option 5: Thus, what we really need is a PRIMARY case role; i.e., one that indicates the same role as the subject of the verb. Consider the following sentence:

    She died in the plane crash with three other passengers.

Here, the comitative entities "three other passengers" experienced exactly the same fate as the subject. Compare this with the beneficiary case role discussed earlier, where the secondary patient does NOT experience the same effect as the primary patient. Since the comitative is, in effect, a grammatical voice change, we can create a new class-changing morpheme that demotes a PART of the subject and makes it oblique. We can call this class-change the 'cosubject' derivation:

    cosubject   -ne-   demotes part of the subject and makes it
                       obliquely expressible

Thus, the corresponding case tag will be "nepe". As with passive and anti-passive, there is no need to mark the VERB with the cosubject morpheme, since doing so would be redundant. However, marking the verb IS useful if the demoted entity is NOT being expressed obliquely. In this case, the marked verb would imply that it has a cosubject even though the cosubject is not being expressed. We'll see examples of this later.

We will also need a class-changing morpheme to indicate that an entity is specifically being EXCLUDED as a possible subject.
The corresponding case tag will have the meaning 'without': non-subject -sau- an entity is specifically excluded from being subject Thus, the case tag meaning 'without' is "saupe". We will see additional uses for both case tags later, in the section on _modality_. Finally, some languages also have comitative case tags that link to the OBJECT of the verb rather than to the subject (the only language I know of that can do this is Mayali, Gunwinyguan family, Australia). English can do this occasionally, but only when the semantics and context make it impossible to interpret a link with the subject, as in "Bob sent her some flowers yesterday WITH a get-well card". Since this usage is quite rare, I recommend against creating a unique class-changing morpheme for it. (In fact, in my dialect of English, this usage sounds distinctly "odd", and can be just as easily implemented as "flowers AND a get-well card".) 4.3.4 LOCATION Most languages, including English, have several verbs that are inherently locative in nature, such as "to enclose", "to enter", "to arrive", "to exit", "to put", "to lower", etc. Almost all of these words can be derived from roots that will also be useful in the derivation of locative case tags and many other useful verbs, adverbs, and adjectives. For example, "to raise" is the A/P-d verb formed from the root meaning 'up (unfocused)' or 'above (focused)'; i.e., 'agent causes patient to become above an unspecified focus'. This root concept of 'up/above' can also be used to create the words meaning 'to rise', 'above', 'up', 'upwards', and so on. To illustrate this process, let's start with the basic state concept meaning 'located at' and try to derive as many useful words as possible from it. For this illustration, we will use the state verb root "-me-". Here is an example of a standard paraphrase of the English preposition "at": John studied law at Harvard. = In the event in which John studied law, he was at Harvard. 
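The standard paraphrase template is regular enough to be written down mechanically. The following Python sketch is only an illustration of that regularity; the function name `paraphrase` and its parameter names are hypothetical labels and are not part of the sample language.

```python
def paraphrase(main_event: str, sub_event: str) -> str:
    """Fill the standard template:
    'In the event in which X occurred, sub-event Y occurred'.
    """
    return f"In the event in which {main_event}, {sub_event}."


# The locative example above, plus the instrument example from section 4.3.1.
print(paraphrase("John studied law", "he was at Harvard"))
print(paraphrase("he broke the window", "he used a hammer"))
```

In each case, the verb of the sub-event ('was at', 'used') names the verb from which the corresponding case tag is derived.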
And here are some of the words we can create using this concept (the default class for "-me-" will be P/F-s):

    mesi = memasi = P/F-s = 'to be located at/in'
        e.g. John mesi Boston = John is in Boston.
    mededa = memadeda = middle F-s [-P] noun = 'location/position'
        e.g. Its mededa is a secret = Its location is a secret.
    mepe = memape = P/F-s case tag = 'at/in'
        e.g. John works mepe Boston = John works in Boston.
    mesepe = P-s adverb = '(at) someplace', '(at) somewhere', 'at some unspecified place'
        e.g. He lost it mesepe = He lost it somewhere.
    medosi = P/F-d = 'to become located at', 'to get to'
        e.g. How did the table medosi the other room? = How did the table get to the other room?
    medope = P/F-d case tag = 'to' (destination location)
        e.g. I sent it medope Boston = I sent it to Boston.
             He ran medope the house = He ran to the house.
    mepiape = P-d adverb = '(to) someplace', '(to) somewhere', 'to some unspecified place'
        e.g. They moved it mepiape = They moved it somewhere.
    meguipe = P/F-p case tag = 'towards' (i.e. potential destination)
        e.g. We ran meguipe the house = We ran towards the house.

Now, if we had a way to negate the meaning of the root, we could create words with meanings such as 'to be not-at = to be away from', 'to become not-at = to get away from', and so on. To accomplish this, I will introduce the new class-changing morpheme "-na-" with the meaning 'not' or 'other than'. With this suffix, we can derive a few more useful words:

    menasi = menamasi = P/F-s = 'to be not located at', 'to be away from'
        e.g. John menasi Boston = John is away from Boston.
    menape = menamape = P/F-s case tag = 'not at/in', 'away from'
        e.g. John attends school menape his home town = John attends school away from his home town.
    menasepe = P-s adverb = '(at) elsewhere', '(at) somewhere else', '(at) someplace else'
        e.g. They found it menasepe = They found it somewhere else.
    menadosi = P/F-d = 'to become located away from', 'to get away from'
        e.g.
The boat menadosi the wharf = The boat got away from the wharf.
    menadope = P/F-d case tag = 'from', 'away from' (source location)
        e.g. I sent it menadope Boston = I sent it from Boston.
             He ran menadope the house = He ran (away) from the house.
             They cut the rope menadope the fence = They cut the rope away from the fence.
    menapiape = P-d adverb = 'away', '(to) elsewhere', '(to) somewhere else', '(to) someplace else'
        e.g. They chased the dog menapiape = They chased the dog away.
             He moved the papers menapiape = He moved the papers somewhere else.

Note that the following useful words can also be derived from the root "me-", even though they are not needed to form case tags that represent English prepositions:

    A/P/F-d:
        mekosi = 'to move to', 'to place at/in', 'to put at/in'
            e.g. We mekosi the barrels the backyard.
                 = We moved the barrels to the backyard.
        menakosi = 'to move away from', 'to remove from', 'to take away from'
            e.g. We menakosi the books the shelves.
                 = We removed the books from the shelves.

    A/F-d [+P]: ("-ga-" is the affix for the anti-passive)
        menakogasi = 'to rid (of)'
            e.g. The workers menakosi the building gape mice.
                 = The workers rid the building of mice.
            [Note that by using the anti-passive case tag "gape", it is not necessary to mark the verb itself as anti-passive.]

    A/P-d [+F]: ("-gue-" is the affix for the anti-anti-middle)
        mekoguesi = 'to put away', 'to put out'
            e.g. Please mekoguesi your toys.
                 = Please put away your toys.
        menakoguesi = 'to take away', 'to take out'
            e.g. The trashman menakoguesi the old TV.
                 = The trashman took away the old TV.

    A/F-d [-P]: ("-xi-" is the affix for the anti-middle)
        menakoxisi = 'to evacuate', 'to clear out'
            e.g. The police menakoxisi the village.
                 = The police evacuated the village.

    AP/F-d:
        mesuasi = 'to reach', 'to arrive at/in', 'to come to'
            e.g. We mesuasi Atlanta at 3 PM.
                 = We reached Atlanta at 3 PM.
        menasuasi = 'to leave/depart (transitive)', 'to go from'
            e.g. She just menasuasi the meeting.
= She just left the meeting. AP-d [+F]: ("-miu-" is the affix for the anti-anti-passive) mesuagasi = 'to take a place', 'to position oneself', 'to settle down (idiomatic)' e.g. We mesuagasi as soon as we arrived. = We took our places as soon as we arrived. menasuagasi = 'to go out/off', 'to head out', 'to take off' e.g. We menasuagasi at 6 AM every morning. = We headed out at 6 AM every morning. A/P/F-s: metuesi = 'to keep at/in' e.g. She metuesi the stallion the barn. = She keeps the stallion in the barn. menatuesi = 'to keep away from' e.g. I menatuesi the dogs the chicken coop. = I keep the dogs away from the chicken coop. A/P-s [+F]: metuemiusi = 'to constrain', 'to keep in (place)', 'to limit/ restrict movement of' e.g. We metuemiusi the larger dog. = We restrict the movement of the larger dog. menatuemiusi = 'to keep out/away/at bay', 'to hold off' e.g. He menatuemiusi the mosquitos with a net. = He keeps the mosquitos away with a net. AP/F-s: mefisi = 'to attend', 'to stay/remain at' e.g. I mefisi the conference for three days. = I attended the conference for three days. menafisi = 'to avoid', 'to stay/remain away from' e.g. Bill menafisi his father at the wedding. = Bill avoided his father at the wedding. [The English verb "to avoid" can also have the sense of 'attempting to stay away from'; i.e., without implying that the agent was actually successful. To clearly indicate this sense, we could use the AP/F-p form of the verb, "menamisi".] AP-s [+F]: mefigasi = 'to stay put' e.g. I told the children to mefigasi. = I told the children to stay put. menafigasi = 'to stay out', 'to stay away' e.g. He menafigasi until sunrise. = He's staying out until sunrise. Note that in many of the English versions of the focused verbs, the focus is oblique (e.g. "to stay AT"). Thus, if we want to precisely emulate English, we will need to create and use the more verbose [+F] versions. 
However, this is probably not necessary because English doesn't really seem to make a distinction between the topicality of objects and the topicality of obliques.

Finally, the above derivations are just examples using a SINGLE locative state concept. A language will need many other locative case tags. These tags will describe all of the possible states and relationships that are dealt with by English prepositions and adverbs, and will have such meanings as 'to be above', 'to be behind', 'to be inside', etc. In turn, these roots can be used to create many other useful words. For example, the root used to form the locative verb meaning 'to be inside' can also be used to create the oblique case tag 'inside of', the adverb 'inside', and other useful words such as English "to enter", "to insert", "inwards", "interior", etc. along with all of their opposites.

4.3.5 TIME

Temporal case tags indicate the locus of an event in time. Consider the following examples:

    John bought the book IN March.
    We visited them WHEN we were in New York.
    They built the doghouse OVER the weekend.
    He lost weight SINCE the accident.
    She won't leave UNTIL she sees the boss.
    We met Janice DURING/ON our last visit.
    He plans to leave AT noon.
    I'll take a shower BEFORE I leave.

Note that some English temporal case tags (e.g. "in", "over", and "at") also have locative meanings, while others (e.g. "when", "since", and "until") have only temporal meanings. There are also locative case tags in English that are never used with temporal meanings (e.g. "along", "beneath", "against", "via").

One possible solution to the problem of creating temporal case tags would be to simply use locative case tags with temporal arguments. It is important to keep in mind, though, that different languages assign temporal meanings to locative case tags in different ways, if at all. If you decide to do this, then do it predictably and systematically.
However, I do not feel that overloading locative case tags is a good way to implement temporal case tags. In my opinion, temporal case tags should be developed independently of locative case tags for the simple reason that they have very different meanings. Thus, we will have to create verbs with meanings such as 'to happen at', 'to happen after', 'to happen during', etc. Note that we can also state these verbs in terms of a position on a timeline, such as 'to be at a time locus during/after/etc'. Here is an example using our standard form of paraphrasing: John bought the car before he got married. = In the event in which John bought the car, the time locus was before he got married. Temporal case tags, like their locative counterparts, typically link to the patient of the verb, since they indicate the time locus of the patient when it experienced the state or change of state. Thus, in the above example, the case tag indicates WHEN the patient "John" underwent the change of ownership. Now, it's true that agents and foci often play their roles at the same time as the patient experiences the event, but not always. Consider the following: The riot angered the president. He realized his mistake when he got the bill. In the first example, the agent "the riot" could have occurred before 'the president became angry'. In the second example, the focus "his mistake" MUST have occurred before 'he got the bill'. Thus, temporal case tags must be derived from P/F verbs, since they describe the temporal location of the patient when it experiences the state or change of state. In fact, to emphasize the link to the patient, we will be more accurate if we paraphrase the relationship in terms of the word "being" rather than the word "happening". For example, it is more accurate to say "to BE during a time locus" rather than "to HAPPEN during a time locus". We can use the word "happen", where appropriate, in the English translations. 
Now, let's do a few sample derivations using the concept of 'before'. For these derivations, we will use the root "lunda-" which has a default class of P/F-s. [We'll see how this root is actually "derived" later, when we discuss tense and aspect.] Here are some of the more useful derivations:

lundasi = lundamasi = P/F-s = 'to be at a point in time before', 'to precede', 'to happen/occur before'
    e.g. The accident lundasi the election. = The accident preceded the election.
lundape = lundamape = P/F-s case tag = 'before'
    e.g. John got drunk lundape the party started. = John got drunk before the party started.
lundasesi = P-s = 'to be at a point in time in the past relative to an unspecified focus', 'to be earlier'
    e.g. The accident lundasesi. = The accident occurred earlier.
lundaseno = P-s adjective = 'past', 'previous', 'earlier'
    e.g. The lundaseno performances were much worse. = The past performances were much worse.
lundasepe = P-s adverb = 'at a point in time in the past relative to an unspecified focus', 'earlier', 'in the past', 'previously', 'already'
    e.g. He left lundasepe. = He left earlier or He already left.
lundadosi = P/F-d = 'to get to a point in time before'
    e.g. John lundadosi the party.
         = *John got to a time before the party.
         = Life went on for John until sometime before the party.
         = Life went on for John as the time for the party approached.
         = Time passed for John until sometime before the party.
lundadope = P/F-d case tag = 'until sometime before', 'as the time for X approached'
    e.g. John washed the dishes lundadope the party.
         = *John washed the dishes as he got to a time before the party.
         = John washed the dishes until sometime before the party.
         = John washed the dishes as the time for the party approached.
lundapiasi = P-d = 'to get to a point in the past relative to an unspecified focus'
    e.g. John lundapiasi.
         = *John got to a point in the past.
         = Life went on for John until some time in the past.
lundapiape = P-d adverb = 'getting to a point in the past relative to an unspecified focus', 'as time passed', 'as the clock ticked off the minutes' e.g. John washed the dishes lundapiape. = John washed the dishes as time passed. lundapiano = P-d adjective = 'former', 'erstwhile' e.g. He just saw his lundapiano girlfriend. = He just saw his former girlfriend. [The P-s derivation "lundaseno" emphasizes that the situation occupied a locus in time in the past, and is thus closer in meaning to the English words "past" or "previous". The P-d derivation "lundapiano" emphasizes a transition through time in the past, and is thus closer in meaning to the English words "former" or "erstwhile".] The agentive derivations are not very useful in the real world, since they imply the ability to control the position or the movement of the patient along the timeline. The "-p" derivations are also not very useful, since they imply that the patient can POTENTIALLY move along the timeline; i.e., that time may not flow at the same rate for the patient as for everyone else. I will leave further speculation about the uses of these forms to those readers who are interested in relativity or time travel. Negative derivations with the root meaning 'not before' will be able to represent the compound concept 'at or after' for case tags and 'then or later' for adverbs. The "-s" forms will be very useful. The "-d" forms will be less useful, but will very efficiently represent indications of time that require extreme periphrasis in English. In any case, I will leave the detailed derivations as an exercise for the reader. Finally, the above derivations are just examples using a SINGLE temporal state concept. A language will need several other temporal case tags. These tags will describe all of the possible states and relationships that are dealt with by English prepositions and adverbs, and will represent such concepts as 'time after', 'time at', 'duration', 'frequency', etc. 
However, since time is one-dimensional, we won't need nearly as many temporal case tags as locative ones. Later, when we discuss tense and aspect, we will see how the roots for ALL temporal case tags can be effectively "derived".

4.3.6 REASON

Many things happen as a result of earlier events, conditions, or situations. In English, these events are normally introduced by expressions such as "because", "because of", "in that", "as a consequence of", "(out) of", "from", etc. Here are some examples:

    John left early BECAUSE he had a headache.
    They guarded it carefully BECAUSE OF its great value.
    The book provides a useful resource, IN THAT it lists every restaurant a tourist should avoid.
    He was not allowed to participate AS A CONSEQUENCE OF his past behavior.
    He died OF/FROM a broken heart.
    They agreed to the terms OUT OF fear of retaliation.

Note that some English forms are used exclusively with embedded sentences (i.e. "because" and "in that"), while the others require noun phrase arguments (i.e. "because of", "as a consequence of", "(out) of", and "from").

Since this case role represents the most basic form of indirect causation, we can paraphrase it as follows:

    John left early because of a headache.
    = In the event in which John left early, the indirect cause was a headache.

Thus, in order to derive this case tag, we need the verb meaning 'to cause', which we derived earlier. Here it is again:

    A/P-d: "veyasi = veyapusi" - 'to cause/make/have/create', 'to cause to come into existence', 'to cause to be real/actual'

Now, in order to create a case tag meaning 'because' or 'because of', we need to INVERT "veyasi" to create a link between the event described by the main verb and the secondary agent. Thus, the final result is the P/A-d form "veyavipe", where "-vi-" is the inverse CCM. Here's an example:

    John swerved veyavipe the runaway car. = John swerved because of the runaway car.
Note that the verb form "veyavisi" would mean something like 'to result from', 'to be the result of', 'to come about as a result of', or 'to be a consequence of'. An example using this verb would be:

    The delay veyavisi the manager's lack of experience. = The delay resulted from the manager's lack of experience.

Instead of "veyavipe", we could also use the generic 0/A case tag, "fiape". I can't decide which is better, since both "veyavipe" and "fiape" seem to accomplish exactly the same thing.

Finally, the A/P-d case tag "veyape" is also useful, and would represent the English word "causing" in sentences such as "He swerved too quickly, causing the accident".

4.3.7 PURPOSE

People often do things with specific goals in mind; i.e., with a particular purpose or intent. In English, we express purpose using such words and expressions as "to", "in order to", "in order that", "with the intent of", "so that", "for", etc. Here are some examples:

    John opened the window TO cool the room.
    Bill gave the kids candy IN ORDER TO keep them quiet.
    She wrote the letter WITH THE INTENT OF clarifying her earlier statements.
    I left SO THAT the baby could get some sleep.
    He shot his dad FOR the inheritance.

Note, though, that even though the intent of the agent is obvious, use of the case role does not always imply that the agent is successful:

    John opened the window to cool the room, but it was actually hotter outside.
    Bill gave the kids candy in order to keep them quiet, but they were noisy anyway.
    She wrote the letter with the intent of clarifying her earlier statements, but it, too, was misunderstood.

In all cases, the agent of the main verb is attempting to cause a secondary event. Here is an example of a standard paraphrase of this case role:

    John opened the window to cool the room.
    = In the event in which John opened the window, he attempted to cause the room to become cooler.
Thus, in order to express this meaning, we simply need the "-p" version of the verb meaning "to cause", which we used above to create the reason case tag. The result is the A/P-p verb "veyacesi", meaning 'to attempt to cause or bring about'. Here is an example using the verb form: John veyacesi the end of the meeting. = John tried to bring the meeting to an end. And here is an example using the case tag form "veyacepe": John lied veyacepe get his promotion. = John lied (in order) to get his promotion. Finally, a more 'cognitive' version of the purpose case tag could be created from a non-generic AP/F-s verb meaning 'to intend'. This case tag would be similar in meaning to the English expression "with the intent of". 4.3.8 MANNER The manner case tag describes HOW something happens. It can be paraphrased as "in the manner of" or "in an X manner", and answers the question "how did such-and-such occur". English implements the manner case using prepositions and adverbs. Here are some examples: He left the room IN haste. (preposition) He SECRETLY left the room. (adverb) He drove the car LIKE a madman. (preposition) He QUIETLY closed the door. (adverb) We've already seen how to convert verbs to adverbs, some of which function like English manner adverbs. There are times, however, when manner cannot be indicated with a simple adverb, as in the following examples (manner case tags are capitalized): The army raced through the town LIKE a destructive tidal wave. The truck drove WITH a lot of rattling. Their singing sounds LIKE wailing banshees. They left the room IN a great hurry. The preacher berated the congregation AS IF they were naughty children. Most manner case roles are indicated in English with the preposition "like". However, even here, we have two distinct senses. Consider the following: He drove the truck LIKE a tank. He drove the truck LIKE a madman. In the first sentence, the word "like" describes the behavior of the patient. 
In the second sentence, it describes the behavior of the agent. Also, the first sentence itself has two distinct interpretations: He drove the truck causing it to be like a tank. He drove the truck as if it were a tank. The best way to capture these distinctions as precisely as possible is to create case tags from a root morpheme meaning 'like' or 'similar'. In the sample language, we will assign the root "-lo-". The three verb derivations that are most useful here are as follows (default class = P/F-s): P/F-s: losi=lomasi 'to be similar to', 'to resemble', 'to be like' AP/F-s: lofisi 'to act similar to', 'to imitate' A/F-s [-P]: lotuexisi 'to cause to be similar to', 'to approximate' When converted to case tags, these three verbs will provide the needed semantics for the above manner expressions: "like a tank", "like a madman", "like a destructive tidal wave", "like wailing banshees" and "as if they were naughty children". In the last item, "as if they were naughty children" would be expressed as "like naughty children" where "like" would be implemented using the P/F form. Now, while the above three derivations are semantically precise, some people may object to having to learn three case tags instead of just one. In a situation like this, it may be advisable to use the non-specific 0/F classifier discussed earlier plus the root meaning 'similar'. Using this approach, the single, all-purpose, manner case tag will be "lojope". For the metaphoric example "the truck drove WITH a lot of rattling", we do not need a case tag at all. We simply use the simpler expression "the truck drove and rattled a lot" or "the truck drove, rattling a lot". For the metaphoric example "they left the room IN a great hurry", we again simplify, using something like "they left the room, hurrying a lot". Finally, manner is sometimes expressed in initial and final forms. Here is an example: His behavior changed from that of an arrogant prince to that of the humblest peasant. 
Here the expressions "from that of" and "to that of" can be implemented with the P/F-d forms "lonadope" = 'becoming not similar to' and "lodope" = 'becoming similar to'.

4.3.9 CASE TAGS FOR EXCHANGE VERBS

In the section on verb semantics, we discussed the need for the additional case roles of secondary agent-patient and secondary focus in sentences such as:

    John sold the book to Bill for five dollars.

Here, "John" is the primary agent-patient, "the book" is the primary focus, "Bill" is the secondary agent-patient, and "five dollars" is the secondary focus. These are secondary roles because they do NOT play the same roles as the corresponding roles for the main verb, but they DO take part in the change of possession.

English uses two separate case tags to indicate the secondary agent-patient, depending on the direction of transfer. These are "to" and "from", and are often referred to as _recipient_ and _donor_, respectively. However, there is really no need to implement two case tags, since the verb always indicates the direction of transfer. For example, in the following sentences, it's obvious who is the donor and who is the recipient:

    John sold the book "to or from" Bill.
    Bill bought the book "to or from" John.
    Sally gave the book "to or from" Mike.

And for verbs like "swap", we use the more neutral preposition "with" for the secondary agent-patient:

    John swapped his book for a magazine WITH Bill.

Thus, there is no need to implement two case tags for the "to/from" roles. Whenever there is a change of possession, the role indicated by the English prepositions "to", "from", and "with" is always specified by the verb, and using the case tag to indicate the direction of transfer is simply redundant.

Also, note that the secondary focus uses the same English preposition "for", REGARDLESS of the direction of transfer, as in:

    He sold the bike FOR $50.
    He bought the bike FOR $50.
    He swapped the book FOR a magazine.
Thus, both secondary roles can be derived in exactly the same way as the secondary patient (i.e. beneficiary) that we discussed earlier, by applying the appropriate classifier directly to the case tag terminator. Specifically, for the secondary focus, we only need the classifier for 0/F "-jo-" plus "-pe", with the final result "jope". For the secondary agent-patient, we need the 0/AP classifier "-piu-", with the final result being "piupe".

4.3.10 STATE

Many verbs allow an expression that provides more information about the final state of the patient. Here are some examples:

    He drilled the board FULL OF HOLES.
    He sliced the meat INTO SMALL PIECES.
    The crowd shouted itself HOARSE.
    The crowd shouted itself INTO A FRENZY.

Linguists call these constructions _resultatives_. It's also possible, though, to specify initial states:

    He changed FROM A SOFT-SPOKEN LIBERAL to a religious fanatic.
    He built the doghouse OUT OF SCRAP LUMBER.
    He worked the gold FROM AN INGOT into a flat sheet.

It's also possible (although rare) to specify a steady state. Compare the following two sentences:

    He kicked the door OPEN. (change-of-state)
    He held the door OPEN. (steady-state)

In English, most steady states are handled with adverbs, as in the following examples:

    They GLADLY tagged along.
    He QUIETLY ignored his brother.
    She imitated her boss CONVINCINGLY.
    The lights are flashing RAPIDLY.

Colors, though, are generally used in their adjective forms:

    The lights glowed red and blue.

Thus, in English, initial states are introduced by the prepositions "from" or "out of". Final states which are represented by noun phrases use the prepositions "in" or "into". Final states which are represented by adjectives do not use any case role marker. Steady states use either adjectives (rarely) or adverbs (frequently).

All of these situations can be dealt with quite easily in the current framework. For the manner case role, we introduced the verb "to be similar to" and its derivatives.
For the state case role, we will need a verb meaning 'to be equal to' or 'to be the same as'. For example, if the root "-kapsu-" represents the state concept of 'equality', then the following case tags can be derived (default class = P/F-s): P/F-d: kapsudope = English "to/into" (literally "becoming the same as") kapsunadope = English "from/out of" (literally "becoming not the same as") Where English uses adjectives or adverbs, we would use the adverb form of the appropriate P-s or P-d verb. Also, note that the P/F-s verb "kapsusi=kapsumasi" is equivalent to the English copula "to be". However, since the concept of 'being' is an inherent feature of all of our P-s verbs and basic nouns, this verb will have very little use in our sample language. Consider the following: John is intelligent: John kapsusi tencida. OR John tencisi. John is a duck: John kapsusi guasuda. OR John guasusi. However, the verb "kapsusi" will still be useful when emphasis is needed or when the noun needs modification. Finally, do not confuse state case roles with the focus case role. Consider the following: Louise ran the marathon. Louise sang an aria. In both examples, the object is a focus. If it were a state, it would describe the state of Louise. In other words, it would indicate that Louise WAS a marathon or an aria. However, neither "marathon" nor "aria" describe the state of a patient - instead, they elaborate the events. 4.3.11 MEANS or METHOD The means case role elaborates what the agent is doing. In English, this case role is normally marked by the prepositions "via", "by", or "by means of". Here are some examples: He cooled the stew BY blowing on it. She explained BY MEANS OF a story. We solved the problem BY asking for help. He knocked the chair over BY kicking it. They isolated the virus VIA a new technique. I broke the dish BY accident. 
As we discussed earlier, a generic A/P/F-d action verb indicates that the agent successfully affects the patient BY MEANS OF the focus, without specifying the precise action that was used. We also derived a generic AP/F-s action verb "zefisi" which means 'to do (something)' where, again, the focus elaborates what the agent-patient is doing. Thus, the focus of these verbs is actually the means case role. However, which form of the verb should we use? The A/F-s [-P] form, the A/F-d [-P] form, the AP/F-s form, or the AP/F-d form? Since the means or method is an elaboration of the event indicated by the main verb, it also can be either static or dynamic. Thus, the only choice we have is to use the 0/F classifier "-jo-". Thus, the final result is the generic 0/F action case tag, "zejope". Finally, do not confuse the means/method case role with the reason case role. For example, in "He won the race BY practicing daily", the preposition "by" is really the reason case tag, as in "He won the race BECAUSE he practiced daily". Keep in mind that the means/method case role always elaborates the event. 4.4 SUMMARY OF CASE TAG FORMS In the preceding sections, we derived several case tags. 
Here is a list that allows us to compare their various forms:

Primary case roles:

    Passive                    -> nupe         generic + passive
    Anti-passive               -> gape         generic + anti-passive
    Comitative ('with')        -> nepe         generic + cosubject
    Non-subject ('without')    -> saupe        generic + non-subject

Secondary generic case roles:

    Secondary Agent            -> fiape        generic + 0/A   = Reason 'because (of)'
    Secondary Agent-patient    -> piupe        generic + 0/AP  = Exchange 'to/from'
    Secondary Patient          -> gupe         generic + 0/P   = Beneficiary 'for'
    Secondary Focus            -> jope         generic + 0/F   = Exchange 'for'

Secondary non-generic case roles:

    Instrument 'with'          -> zepe         root + A/P-s
    Means/Method 'by'          -> zejope       root + 0/F
    Locative 'at'              -> mepe         root + P/F-s
    Locative 'to'              -> medope       root + P/F-d
    Locative 'from'            -> menadope     root + not + P/F-d
    Locative 'towards'         -> meguipe      root + P/F-p
    Temporal 'before'          -> lundape      root + P/F-s
    Reason 'because (of)'      -> veyavipe     root + P/A-d
                               OR fiape        generic + 0/A
    Purpose '(in order) to'    -> veyacepe     root + A/P-p
    Manner 'like'              -> lojope       root + 0/F
    Manner 'from that of'      -> lonadope     root + not + P/F-d
    Manner 'to that of'        -> lodope       root + P/F-d
    State 'into'               -> kapsudope    root + P/F-d
    State 'from'               -> kapsunadope  root + not + P/F-d

4.5 INVERSION OF CASE ROLES

Verb forms of case tags can be extremely useful when they are inverted. Here are some examples:

    John bought the book mepe Boston. = John bought the book in Boston.
    Boston MEVISI John bought the book. = Boston IS WHERE John bought the book.

    He broke the window zejope kicking it. = He broke the window by kicking it.
    Kicking the window ZEJOVISI he broke it. = Kicking the window IS HOW he broke it.

    He saw the procession zepe a telescope. = He saw the procession with a telescope.
    A telescope ZEVISI he saw the procession. = A telescope IS WHAT he USED to see the procession.
                                              = A telescope IS WHAT he saw the procession WITH.

And so on. The precise implementation (word order, use of infinitives, participles, complementizers, etc.) will depend on the syntax of the AL.
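The regularity of the forms listed above, and of the inversions in section 4.5, can be illustrated with a short Python sketch. This is only a rough illustration and not part of the language design: the function names, and the assumption that a word is simply a root followed by affixes and a terminator, are made up for the example. The morphemes themselves ("-na-", "-do-", "-vi-", "-si", "-pe", "-no") are the ones used in the text.

```python
# Terminators as they appear in the sample words of the text.
# The dictionary keys are labels invented for this sketch.
TERMINATORS = {"verb": "si", "case_tag": "pe", "adjective": "no"}


def build_word(root, affixes=(), kind="case_tag"):
    """Concatenate a root, zero or more affixes, and a terminator."""
    return root + "".join(affixes) + TERMINATORS[kind]


def invert_case_tag(tag):
    """Turn a case tag into its inverted verb form (section 4.5).

    The final "-pe" is replaced by the inverse CCM "-vi-" plus the
    verb terminator "-si", e.g. "mepe" becomes "mevisi".
    """
    if not tag.endswith("pe"):
        raise ValueError(f"not a case tag: {tag!r}")
    return tag[:-2] + "vi" + "si"


if __name__ == "__main__":
    print(build_word("me", ["na", "do"]))      # menadope, locative 'from'
    print(build_word("lunda"))                 # lundape, temporal 'before'
    print(build_word("me", ["do"], "verb"))    # medosi, 'to get to'
    print(invert_case_tag("zejope"))           # zejovisi, 'is how'
```

The sketch treats every affix as an opaque string, so it says nothing about which combinations are semantically sensible; it only shows that the attested forms follow one concatenation pattern.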
4.6 ADDITIONAL USES OF THE FOCUS CASE ROLE For verbs describing mental states, the focus case role indicates the entity on which the patient's mental state is targeted or "focused". The focus of such verbs is always obvious and needs no further explanation. For verbs describing physical states, however, the focus is not always obvious. In fact, many physical verbs do not appear to have a focus at all. As we will see, though, ALL verbs can have a focus. For many, though, the focus is so strongly implied by the meaning of the verb that expressing it obliquely or as a direct object would be redundant. Before we try to deal with verbs that seem to be inherently unfocused, let's first re-examine the semantics of focus in more obvious situations. Remember, for a focused state verb, the patient experiences a steady state or undergoes a change of state IN ITS RELATIONSHIP WITH THE FOCUS. For example: 1. John needs money. 2. John owns the house. 3. John bought the house. In (1), we are describing a relationship between "John" and "money". The relationship is defined by the state concept "need". In (2) and (3), we are describing a relationship between "John" and "the house". The relationship is defined by the state concept "ownership", where (2) describes a steady state and (3) describes a change of state (number 3 also implies the use of money as a secondary focus). Thus, there is a relationship between the patient of the verb and the focus. Let's extend this idea to some simple static verbs: The country is rich "focus" oil. = The country is rich in oil. I'm angry "focus" Louise. = I'm angry at Louise. The house is free "focus" termites. = The house is free of/from termites. The little girl is afraid "focus" thunder. = The little girl is afraid of thunder. The box is heavy "focus" bricks. = The box is heavy with bricks. John is proud "focus" his father. = John is proud of his father. John is happy "focus" Louise. = John is happy for/about Louise. 
Note that the above examples can be expressed either as P/F-s verbs where the focus is the direct object, or as P-s [+F] verbs with an oblique focus. Thus, all of the English examples above are inherently anti-passive.

Do not make the mistake of analyzing the above foci as reasons or indirect causes. For example, the sentence "The country is rich IN oil" does NOT mean the same as "The country is rich BECAUSE OF oil".

We can also be quite specific, as in:

    John is wealthy "focus" $1,000,000. = John is wealthy to the tune of $1,000,000.
    John is tall "focus" 6 feet. = John is 6 feet tall.
    The box is heavy "focus" 10 kilograms. = The box weighs 10 kilograms.
    The new student is intelligent "focus" 160. = The new student has an intelligence (IQ) of 160.

Now, let's extend the analogy further to the dynamic and agentive counterparts:

    P/F-d: The tank filled water. = The tank filled with water.
    A/P/F-d: We filled the tank water. = We filled the tank with water.
    A/P/F-d: He shortened the rope one meter. = He shortened the rope by one meter.
    P/F-d: The plants grew six inches. = The plants grew by six inches.

Thus, for dynamic verbs, the focus is some entity or property that the patient is associated with, and the patient changes in its relationship with the focus. Also note that when precise quantities are specified, they represent the actual, current magnitude for "-s" verbs, and the change in magnitude for "-d" and "-p" verbs. Thus, the focus of the P/F-s "to be long" indicates the current length, while the focus of the P/F-d verb "to lengthen" indicates the change in length.

We can also create examples where the focus is abstract:

    He formatted the document "focus" company standards. = He formatted the document according to company standards.

In other words, the document is in a relationship with a company standard, and the nature of the relationship is indicated by the verb "formatted".
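The rule for quantified foci given above (current magnitude for "-s" verbs, change in magnitude for "-d" verbs) can be made concrete with a small Python sketch. Everything here is a hypothetical illustration: the function name, the "direction" parameter, and the starting values are assumptions made for the example, and the "-p" case is left out because a potential change need not actually occur.

```python
def resulting_magnitude(aspect, current, focus_quantity, direction=1):
    """Return the patient's magnitude after applying a quantified focus.

    aspect         -- "s" (steady state) or "d" (change of state)
    current        -- magnitude before the event (ignored for "s")
    focus_quantity -- the quantity expressed by the focus
    direction      -- +1 for verbs like "lengthen"/"grow",
                      -1 for verbs like "shorten"
    """
    if aspect == "s":
        # For "-s" verbs the focus IS the current magnitude.
        return focus_quantity
    if aspect == "d":
        # For "-d" verbs the focus is the CHANGE in magnitude.
        return current + direction * focus_quantity
    raise ValueError(f"unsupported aspect: {aspect!r}")


if __name__ == "__main__":
    # "John is tall 'focus' 6 feet."
    print(resulting_magnitude("s", None, 6))
    # "He shortened the rope one meter." (assuming a 10-meter rope)
    print(resulting_magnitude("d", 10, 1, direction=-1))
    # "The plants grew six inches." (assuming 20-inch plants)
    print(resulting_magnitude("d", 20, 6))
```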
For some verbs, though, the focus is so strongly implied that expressing it separately seems redundant: The recession impoverished his family (?of money). The cat killed the mouse (?of life). The boys broke the window (?of its structure). We should be able to apply the same logic to specify a focus for verbs that, on first examination, appear to be inherently unfocusable, even if the result is redundant. For example, what could the focus be in the following sentence: John managed the company. (A/P-s) When something is managed, it has operations or other components that can be controlled: John managed the company (?in its operations). However, if the focus adds detail that is not implied by the verb, then a specific focus is not only acceptable but very useful: John managed the company in its overseas operations. For actions, the root of the verb defines what the agent is doing - NOT the state of the patient. Thus, a focus must elaborate the action rather than any relationship with the patient. Consider the following: The warrior struck the peasant. (A/P-d) Again, we can focus the action only if it provides more detail, as in: The warrior struck the peasant a mighty blow to the head. (A/P/F-d) In other words, the focus of an action is a more detailed description of the action itself. Note that this is exactly what happens with speech acts, where the focus describes the actual message being conveyed (e.g. "John told the kids A STORY"). Thus, for many verbs, the focus is an inherent part of the meaning of the verb; i.e., it is _lexicalized_. A specific focus only makes sense if it provides more detailed information. Finally, there are indeed concepts that are inherently unfocusable. However, these are not true state or action concepts, and will not be derived as basic verbs. We'll have more to say about them later, when we discuss _deictics_. 4.7 THE OBVIATIVE VOICE Oblique phrases in a sentence modify the event described by the verb and its arguments. 
In addition, an oblique phrase almost always establishes a strong bond with one of the main arguments of the verb. Consider the following examples: Active, A/P-d: John killed the rat quickly with a knife. Passive, P-d [+A]: The rat was killed quickly with a knife. Middle, P-d [-A]: ?The rat killed quickly with a knife. Active, P-d: *The rat died quickly with a knife. Here, the preposition "with" is the instrumental case tag, and is derived from the A/P-s verb 'to use'. Thus, "with" can be paraphrased as 'using' in the above examples, and the agent "John" is the effective subject of the verb "use". In other words, since the subject of the case tag is an agent, the case tag binds the agent of the main sentence to the object of the case tag. If the main verb does not have an agent, the instrumental case tag "with" cannot be used since it has nothing to bind to. Note that the middle voice example is questionable because the agent is almost totally suppressed. The final example is definitely ungrammatical because the sentence does not have an agent, even an implied one. Usually, it's obvious when an argument is bound to an oblique phrase. For example, in "He shoved the box into the room", the case tag "into" clearly indicates the final state of the direct object "box" while saying nothing about the agent. The reason for this is that the case tag "into" is derived from a P/F-d verb meaning 'to become in'. Since the effective subject of the case tag is a patient, it binds to the patient of the main verb. If the main sentence did not have a patient, then the case tag could not be used. Now, there may be times when we'll need to bind a P/F case tag to the FOCUS of the main sentence. At first glance, this would seem to be impossible, since we would be trying to bind a focus with a focus. In other words, we would need to start the derivation of such a case tag with an F/F verb, which is not semantically possible. However, this need DOES exist. 
Consider the following examples:

    1. I hear the symphony.
    2. *I hear the symphony like a wailing banshee.
    3. *The symphony is heard (by me) like a wailing banshee.
    4. *The symphony hears like a wailing banshee.

In the above examples, "like" is the manner case tag that we derived earlier from the P/F-s verb meaning 'to be like, be similar to, or resemble'. Thus, the case tag "like" is trying to bind "a wailing banshee" to "I", since "I" is the patient of the main verb. The result, of course, is gibberish. There is a way, though, to achieve the desired effect:

    5. The symphony sounds (to me) like a wailing banshee.

It seems obvious that the verb "to sound" must be a derivation of the verb "to hear", but exactly what kind of derivation is it? In examples 1 and 2, the verb "to hear" is P/F-s (active), in example 3 it is F-s [+P] (passive), and in example 4 it is F-s [-P] (middle). The verb "to sound" in example 5, however, also appears to be F-s [+P], yet it is clearly not a passive construction. Is there another way to derive an F-s [+P] verb?

Yes. There are two ways that an F-s [+P] verb can be derived:

    P/F-s -> F-s [+P]
  or
    P/F-s -> F/P-s -> F-s [+P]

The first derivation is a simple passive. The second derivation is the combination of an inverse followed by an anti-passive. For the sake of brevity, I will refer to this second pathway as an _obviative_ voice change because of its similarity to a process that occurs in some natural languages. For example, Plains Cree (Algonquian) sometimes uses a combination of an inverse voice change plus obviative case marking on the topicalized noun to achieve an effect that is similar to the one that we're discussing here.

In our sample language, we will implement the obviative voice alteration as follows:

    obviative: -viga- changes P/F-x to F-x [+P], allowing case tag to bind to focal subject

Note that "-viga-" is simply the combination of the inverse CCM "-vi-" and the anti-passive CCM "-ga-".
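The two derivation pathways can be traced mechanically. The following is only a minimal sketch in Python, assuming two-argument verbs and treating each voice change as a rewrite of the argument-structure notation used in this monograph. The class and function names are hypothetical and are not part of the sample language. The sketch models the notation only, not topicality.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ArgStructure:
    """Argument structure in the monograph's notation, e.g. P/F-s [+A]."""
    core: Tuple[str, ...]             # core arguments, subject first
    aspect: str                       # "s", "d", ...
    oblique: Optional[str] = None     # demoted but expressible, shown as [+X]
    suppressed: Optional[str] = None  # suppressed argument, shown as [-X]

    def __str__(self) -> str:
        text = "/".join(self.core) + "-" + self.aspect
        if self.oblique:
            text += f" [+{self.oblique}]"
        if self.suppressed:
            text += f" [-{self.suppressed}]"
        return text


# Each function below assumes exactly two core arguments (an assumption
# of this sketch); unpacking fails for any other valency.

def inverse(v: ArgStructure) -> ArgStructure:
    """X/Y -> Y/X"""
    subj, obj = v.core
    return replace(v, core=(obj, subj))


def passive(v: ArgStructure) -> ArgStructure:
    """X/Y -> Y [+X]"""
    subj, obj = v.core
    return replace(v, core=(obj,), oblique=subj)


def antipassive(v: ArgStructure) -> ArgStructure:
    """X/Y -> X [+Y]"""
    subj, obj = v.core
    return replace(v, core=(subj,), oblique=obj)


def middle(v: ArgStructure) -> ArgStructure:
    """X/Y -> Y [-X]"""
    subj, obj = v.core
    return replace(v, core=(obj,), suppressed=subj)


def obviative(v: ArgStructure) -> ArgStructure:
    """Inverse followed by anti-passive, i.e. "-vi-" then "-ga-"."""
    return antipassive(inverse(v))


hear = ArgStructure(core=("P", "F"), aspect="s")
print(hear)             # P/F-s
print(passive(hear))    # F-s [+P]
print(obviative(hear))  # F-s [+P]
```

Both pathways end in the same notation, F-s [+P]. The notation alone therefore cannot distinguish the passive from the obviative; the difference lies entirely in how the result was reached.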
I must emphasize that the semantics of the obviative differ significantly from the semantics of the passive. In the passive, the focus is made more topical than the patient. In the obviative, the focus is first made more topical than the patient by an inverse voice change. Then, it is made EVEN MORE topical by an anti-passive voice change. In other words, the topicality of the focus is increased twice. With the passive or middle construction, the topicality of the focus is increased only once. The net result is that the topicality of the focus increased so much that it became an EFFECTIVE patient, allowing itself to be bound to the manner phrase. In general, then, the obviative voice alteration allows a case tag which normally binds to the patient of a main verb to bind to the focus of the main verb. For example, the semantics of the manner bond can be paraphrased as something like 'patient experiences a relationship with focus, and focus is like X'. Let's see how productive this can be: I see Bill, and he's like a zombie. P/F-s: *I see Bill like a zombie. Passive: *Bill is seen (by me) like a zombie. Obviative: Bill looks (to me) like a zombie. I noticed/caught sight of Bill, and he was like a bat out of hell. P/F-d: *I noticed Bill like a bat out of hell. Passive: *Bill was noticed (by me) like a bat out of hell. Obviative: Bill appeared (to me) like a bat out of hell. I smell the soap, and it's like cheap perfume. P/F-s: *I smell the soap like cheap perfume. Passive: *The soap is smelled (by me) like cheap perfume. Obviative: The soap smells (to me) like cheap perfume. I taste the cake, and it's like sawdust. P/F-s: *I taste the cake like sawdust. Passive: *The cake is tasted (by me) like sawdust. Obviative: The cake tastes (to me) like sawdust. I feel the page, and it's like sandpaper. P/F-s: *I feel the page like sandpaper. Passive: *The page is felt (by me) like sandpaper. Obviative: The page feels (to me) like sandpaper. Thus, it can be useful. 
However, all of the above are verbs of perception that specifically refer to the five human senses. Let's see if the voice change is useful with other verbs: I own the Jaguar, and it's like a gift from heaven. P/F-s: *I own the Jaguar like a gift from heaven. Passive: *The Jaguar is owned (by me) like a gift from heaven. Obviative: The Jaguar ??? (to me) like a gift from heaven. I divorced Louise, and she was like an angel. AP/F-d: *I divorced Louise like an angel. Passive: *Louise was divorced (by me) like an angel. Obviative: Louise ??? (to me) like an angel. I love Louise, and she's like an angel. AP/F-s: *I love Louise like an angel. (wrong meaning) Passive: *Louise is loved (by me) like an angel. Obviative: Louise ??? (to me) like an angel. The only way the above sentences can work in English is by replacing the question marks by a verb such as "to seem" or "to be". In doing so, though, we lose the meaning of the original verbs 'own', 'divorce', and 'love'. In summary, English implements the obviative by means of suppletion (e.g. "see/look" and "hear/sound"), by using the same verb unchanged (e.g. "feel", "taste", and "smell"), or by periphrasis involving two distinct clauses. I strongly suspect that if English had a formal obviative construction, it would be used quite often. Finally, since combining the inverse with the ANTI-PASSIVE is quite useful, would it also be useful to combine the inverse with the PASSIVE? The answer is probably "no". Using the obviative, we topicalize the focus twice, which is quite a significant change in semantics. If we first invert and then passivize, however, we will increase the topicality of the focus and then immediately decrease it with little or no net result. 5.0 OPEN ARGUMENTS AND MODIFIERS Since verbs can be converted to oblique, verb-modifying case tags and adverbs, why not apply the same logic to create the equivalent of English prepositional phrases that can function as either noun phrases (e.g. 
"the man WITH THE RED HAT") or adjective phrases (e.g. "countries RICH IN OIL")? By its very nature, a verb has arguments. When other parts of speech are derived from verbs, they can also have arguments. Thus, an adverb modifies the verb but takes no additional arguments of its own. A case tag, however, modifies the verb in the same way while adding one or two new arguments to the verb. In effect, a case tag is an _open_ verb modifier, since its non-subject arguments are available for use. An adverb, however, is a _closed_ verb modifier, since it cannot take any more arguments. The same distinction can be made with other parts of speech that are derived from verbs. For example, the nouns and adjectives that we've seen so far are all closed, since they take no arguments of their own. In this section, we will discuss what happens when we 'open them up'. In order to do this, though, we first need to summarize what we've done so far, and introduce a few new rules: 1. The part-of-speech of a word in our sample language is indicated by the word terminator: -si = verb -pe = verb modifier (i.e. case tags or adverbs that modify verbs) -da = noun -no = noun modifier (i.e. adjective) -di = previous-word modifier (e.g. adverbs that modify adjectives) 2. By definition, verbs and verb modifiers are inherently open. Nouns, noun modifiers, and previous-word modifiers are inherently closed. 3. Three new terminators will be assigned that will open up the argument structure of words that are inherently closed. They are: -giu = open noun -bie = open noun modifier (i.e. open adjective) -nia = open previous-word modifier 4. An appropriate grammatical voice operation can be performed to close the argument structure of words that are inherently open. Note that rule (1) introduces a new part-of-speech indicated by "-di". Words with this ending will always modify the immediately preceding word, regardless of its part-of-speech. 
Thus, they can be used to implement English adverbs that modify adjectives (e.g. "RECENTLY married couple", "RAPIDLY flowing stream", etc.). And, as we will see later, this part-of-speech will be very useful in other applications. [Incidentally, we are defining this new part-of-speech as a "previous-word" modifier because, later, we will open up its argument structure and allow it to modify the word it follows while being followed by its own argument. This can be done more easily if we adopt a pure right-branching syntax. English does something like this with adjectives. An unfocused adjective precedes the noun it modifies, while a focused adjective follows the noun; cf. "rich countries" vs. "countries rich in oil".] Rule (2) is nothing new and simply re-iterates and formalizes what we've been doing all along. Rule (3) can be used to create nouns, adjectives, and adjective modifiers that take arguments. I will illustrate how to do this below. Rule (4) simply re-iterates something we already know. That is, we can apply grammatical voice operations to remove one or more arguments from a verb, effectively closing it. This will allow us to create verb-modifying adverbs that do NOT take arguments of their own from verbs that normally take objects. For example, passive forms can be used to create adverbs such as "unexpectedly", "repeatedly", "amusedly", "warnedly", etc. Anti-passive forms can be used to create adverbs with meanings such as "destructively", "lovingly", "oppressively", "knowingly", etc. Finally, because of these new additions, we must expand our definition of the terminator category to the following: Terminator ::= da | di | giu | nia | bie | no | pe | si 5.1 OPEN ADJECTIVES By opening up the argument structure of adjectives, we can create words that represent the functions of many English prepositions. 
Consider the following examples:

    the book WITH the red cover     ("with" = 'having')
    the cup ON the table            ("on" = 'being located on')
    the circus AT the fairgrounds   ("at" = 'being located at')
    the can OF beans                ("of" = 'containing')
    the magazine UNDER the box      ("under" = 'being under')
    the pile OF junk                ("of" = 'consisting of')
    the pound OF beef               ("of" = 'consisting of')
    the building ACROSS the street  ("across" = 'being across')
    the paper BY Smith              ("by" = 'having Smith as agent')

Note that all of the above (except the agentive "by") must use the P/F-s forms of the appropriate verb. Each open adjective will link a noun with the argument of the adjective. Here are a few derivations using morphemes we've already defined:

    agent   -> fiabie               e.g. the book by Mark Twain
    for     -> gubie                e.g. the party for Jill
    at      -> mebie                e.g. the man at the corner
    to      -> medobie              e.g. the letter to Louise
    from    -> menadobie            e.g. the letter from Louise
    before  -> lundabie             e.g. the day before the party
    reason  -> veyavibie OR fiabie  e.g. the delay veyavibie Joe = the delay caused by Joe
    purpose -> veyacebie            e.g. the petition for his release
    method  -> zejobie              e.g. death by strangling
    state   -> kapsunadobie         e.g. the hut kapsunadobie straw = the hut made (out) of straw

And so on.

Note that verbs that have had their argument structure inverted (with a voice-changing morpheme) can also be converted to open adjectives. This would allow you to handle distinctions such as "the man owning the house" vs. "the house belonging to the man". I leave the actual implementation as an exercise for the reader.

Open adjectives created in this way, however, are often more precise than is really needed. For example, in the noun phrase:

    the book WITH the red cover     ("with" = "having as a component")

the relationship between the book and its cover is vague enough that we don't really need to derive the open adjective from the verb meaning "to have as a component". Instead, we can use the secondary generic P/F-s open adjective, "mabie".
I would also suggest using "mabie" for all of the above examples that use the English preposition "of". Keep in mind that interpretations of "mabie" can be different depending on context, since the generic root morpheme does not indicate a specific relationship. Out of context, an accurate paraphrase of "the box mabie toys" would be "the box having an unspecified relationship with toys". Thus, a likely translation would be "the box OF toys". This non-specific relationship is, of course, precisely the meaning of the English preposition "of" and its many counterparts in other languages (cf. "a box full OF toys" vs. "a box bereft OF toys"). Since the derivation of open adjectives is essentially the same as the derivation of case tags, I won't spend much more time on it here. In general, most case tags will have adjective counterparts, especially the locative ones. Also, keep in mind that different languages implement these functions in different ways. For example, in many languages, they are neither adpositions nor inflections, but are implemented as relative clauses (e.g. "the boy in the kitchen" = "the boy who is in the kitchen"). Also, a few languages, such as English, allow case tags to be used, unmodified, as open adjectives. However, I strongly recommend against this because case tags and open adjectives are semantically distinct, and because conflating them often results in attachment ambiguities. For example, in the sentence "I spoke with the lady in the storeroom", is "in" a case tag that links to the verb "spoke", or is it an open adjective that links to the noun "lady"? Besides, natural languages that use the same word for both roles, including English, often do so idiosyncratically. Consider the following: I put the box UNDER the bed. The box UNDER the bed is empty. The man walked INTO the room. *The man INTO the room is my brother. The man WHO WALKED/CAME INTO the room is my brother. He built the doghouse OUT OF plywood. 
*The doghouse OUT OF plywood is not as good as the plastic one. The doghouse MADE OF plywood is not as good as the plastic one. They delayed the operation BECAUSE OF his death. ?The delay BECAUSE OF his death was unavoidable. The delay CAUSED BY/DUE TO his death was unavoidable. The boy HAS a red hat. *The boy is WITH a red hat. The boy WITH the red hat is her son. In other words, sometimes the case tag is the same as the open adjective, while other times it is not. When it is not the same, it is either periphrastic or idiosyncratic. Open adjectives are not only useful for creating the equivalents of English prepositions. They can also be used to create adjective phrases. Here are a few English examples: My BEER DRINKING buddies recommend brand X. Countries RICH IN NATURAL RESOURCES should be more generous. I just bought a WOOD BURNING stove. Homes BELONGING TO THE POOR are taxed at a lower rate. ['to belong to' = inverse of 'to own'.] The number of children BITTEN BY DOGS decreased last year. ['to be bitten by' = inverse of 'to bite'.] The actual word order of the constituents of an adjective phrase will, of course, depend on the syntax of your AL. The last two examples illustrate how an inverse construction can be used where English requires either a different verb (i.e. "to belong to") or a passive construction plus a preposition (i.e. "to be bitten by"). Note that the separate words meaning 'to' and 'by' are NOT used with true inverse forms. 5.1.1 "ABOUT" VERB ARGUMENTS Some English verbs allow objects that begin with the preposition "about". Here are some examples: She wrote about her childhood. They know about the problem. We heard about his promotion. I thought about what she said. You argued about money. He told me about the project. This is especially common with speech act verbs, since the missing headword is always obvious from the meaning of the verb. 
Thus, "he told me about X" can be paraphrased as "he told me words about X", where "words" elaborates the speech act. However, in the system proposed here, a separate word equivalent to "about" is simply not necessary, since the argument of "about" is, in fact, the focus of the verb. Thus, in the sample language, we can simply state "She wrote her childhood". If we need to specify the degree to which the focus is involved, we can use an appropriate headword plus the generic linker "mabie":

    She wrote something mabie her childhood
      = She wrote something about her childhood.
    She wrote a lot/everything/nothing mabie her childhood
      = She wrote a lot/everything/nothing about her childhood.

[We'll discuss how to derive the words meaning 'something', 'a lot', and so on later.]

5.2 OPEN NOUNS

By opening up the argument structure of nouns, we can create more complex noun phrases without having to resort to the use of prepositions, relative clauses or other subordinate constructions. Here are a few English examples:

    CHEMISTRY STUDENTS should register tomorrow.
      [Here, "student" is the open noun "teyomigiu" and is immediately followed by the noun meaning 'chemistry'.]
    They hired a TERMITE EXTERMINATOR.
    BASEBALL PLAYERS get paid too much.
    I am no longer a COFFEE DRINKER.

In all of the above, since we are using the noun version of a verb, and since this represents a generic subject, the subject position is automatically filled. Thus, we can say "chemistry students" where "chemistry" is a noun, but we cannot say "boy students", where "boy" is also a noun (although we CAN use the adjective version of "boy").

If we first invert a verb and then use its open noun form, the original subject position becomes available while the original object position becomes automatically filled. For example, the inverse-noun form of "to study" would correspond to the English words "subject" or "topic".
If we then open it up, we can create an expression like "subject John", which would be equivalent to the English expressions "John's subject of study" or "the subject that John is studying".

Later, when we discuss class-changing morphemes in more detail, we will be able to derive process nouns from verbs, such as "destruction" from "to destroy". If we open up process nouns, BOTH argument positions will have to be filled. This will allow us to emulate English expressions such as "the destruction of the city by the enemy" without the need for prepositions.

We can also create open noun versions of basic nouns. Earlier, when we discussed the conversion of basic nouns to verbs, we saw how these verbs could be focused. Here again are some of the examples we used:

    We 'treed' three acres WITH ELM AND OAK.
    They landed the plane ON RUNWAY THREE.
    He penciled the sign WITH GRAFFITI.
    I glued the envelope TO THE BOX.

Note that the object of each sentence is the patient and the argument of each preposition is the focus. If we eliminate the agent, we can get expressions like the following:

    three acres "of" elm and oak
    the plane "of" runway three
    the sign "of" graffiti
    the envelope "of" the box

where "of" can be better paraphrased as "having been brought together with". Thus, we can obtain the equivalent of the above phrases by opening up a P/F-s version of the basic noun.

Now, since a basic noun has the class P-s by default when converted to a verb, it makes no sense to open it up without first adding an appropriate verb classifier to give it an object. Let's take advantage of this by adopting the rule that using the open noun terminator "-giu" with a basic noun will convert the default class to P/F-s (classifier "-ma-"). In other words, if a basic noun ends with "-giu" but does not have a verb classifier, then it will be equivalent to "... magiu".
Here are some examples: naida - 'natural location/place/spot' guajida - 'hot spring' naigiu guajida - 'place of/having hot springs', 'hot spring locale' teyoteda - 'school' teyofiuda - 'robot' teyotegiu teyofiuda - 'school of/for/having robots', 'robot school' guanaida - 'lake' guasuda - 'duck' guanaigiu guasuda - 'lake of/for/having ducks', 'duck lake' 5.3 ADJECTIVE MODIFIERS Closed previous-word modifiers can be used to implement English adverbs that modify adjectives. Here are some English examples: The POORLY built homes collapsed in the earthquake. He emptied the PARTIALLY filled can. The EXTENSIVELY mined pit was an eyesore. QUICK-frozen vegetables taste better than canned vegetables. I really enjoy PROPERLY prepared seafood. Note that the above adjective-modifying adverbs are the same as verb-modifying adverbs except that the terminator would be "-di" rather than "-pe". The system proposed here also allows us to create OPEN previous-word modifiers (terminator = "-nia"); i.e. words which modify adjectives, adverbs, etc. and which take an argument and link it to the previous word. We'll see how this can be useful later, when we discuss _comparatives_. 5.4 SEMANTICS OF OPEN NOUNS AND MODIFIERS Some readers may object to the creation of open nouns and open adjective modifiers, claiming that they are simply short cuts for subordinate clauses. While it's true that they can sometimes be used in this way, they are usually quite different because they cannot be modified for tense, aspect, or modality. For example, consider the following: My beer-drinking buddies think ... versus My buddies (who are) drinking beer in the corner over there think ... My buddies who shouldn't drink beer so much think ... My buddies who drank the beer that had gone bad think ... My buddies who may be drinking beer tomorrow night think ... and so on. 
In effect, the phrase "beer-drinking" says nothing about WHEN the event occurred, nor does it provide additional details about WHERE the event occurred, HOW it occurred, etc. In other words, an open modifier like this is indefinite, because it does not describe a PARTICULAR event. [Linguists refer to phrases such as these as _non-finite_. Phrases which are modified for tense, aspect, and modality are called _finite_.] Now, it is certainly possible to design your AL so that a subordinate clause can be explicitly marked to indicate that it is non-finite. However, the result will be longer and less iconic (i.e., long utterances should convey more information than shorter utterances). Also, if you apply this reasoning rigorously, you'll have to eliminate case tags and open adjectives, and replace them with some kind of ad hoc creations. It is important to keep in mind that the intent of open modifiers is to allow the creation of non-finite and indefinite forms which are as efficient as possible. These constructions are NOT intended to be used as shortcuts for subordinate clauses, although they can sometimes be used in this way. In general, the syntax of your AL should allow all possible forms of finiteness and definiteness in subordinate clauses, while restricting this freedom with open modifiers. For example, your syntax should NOT allow an open modifier to be marked for tense, aspect, or modality. And, as we will see later, this lack of 'definiteness' makes open nouns and modifiers extremely useful in the creation of compounds. Finally, although the argument structure of the original verb is available for use in open modifiers, keep in mind that an open modifier IS NO LONGER A VERB. Thus, it cannot be further modified by adverbs or case tags, although it CAN be modified by a modifier of the appropriate class. For example, an open noun can be further modified by an open or closed adjective, but it can NOT take an oblique argument introduced by a case tag. 
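The terminator rules from section 5.0 amount to a small lookup table, and a short sketch may make the open/closed distinction easier to keep straight. The Python below is only an illustration: the table contents come from rules (1) through (3), while the function name and the return format are assumptions of the sketch. Note that it records only whether a word class is INHERENTLY open; it knows nothing about voice operations that close a verb or verb modifier, as described in rule (4).

```python
# Terminator -> (part of speech, inherently open?), from rules (1)-(3).
TERMINATORS = {
    "si":  ("verb", True),
    "pe":  ("verb modifier", True),
    "da":  ("noun", False),
    "no":  ("noun modifier", False),
    "di":  ("previous-word modifier", False),
    "giu": ("noun", True),
    "bie": ("noun modifier", True),
    "nia": ("previous-word modifier", True),
}


def classify(word: str):
    """Return (terminator, part of speech, inherently open) for a word.

    Hypothetical helper: it looks only at the ending of the word and
    does not analyze the root, classifier, or any CCMs.
    """
    # Try longer terminators first so that a three-letter ending is
    # never mistaken for a two-letter one.
    for term in sorted(TERMINATORS, key=len, reverse=True):
        if word.endswith(term) and len(word) > len(term):
            part_of_speech, is_open = TERMINATORS[term]
            return term, part_of_speech, is_open
    raise ValueError(f"no known terminator on {word!r}")


for w in ("teyosi", "teyodeda", "teyodeno", "teyomigiu", "mabie"):
    print(w, classify(w))
```

A syntax checker built along these lines could, for example, reject a bare noun argument after any word whose terminator is inherently closed.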
6.0 CLASS-CHANGING MORPHEMES We've already discussed class-changing morphemes which perform grammatical voice changes and which make mass/count/group distinctions. In this section, I would like to discuss some of the other class-changing morphemes (henceforth CCMs) that will be needed by a language. Important Note: I will be using the abbreviation "CCM" extensively throughout the remainder of this monograph. First, though, it's important to emphasize that a CCM changes the CLASS of a word. That is, it changes the basic nature of the word - it does NOT change the basic meaning of the root. Thus, when a CCM is applied to an existing class, it, in effect, creates a new class. In the process, the way the word interacts syntactically with other words in the sentence may also change. For example, a grammatical voice alteration changes the argument structure of the verb. In our sample language, when suffixed to an existing classifier, the CCM will create a new class which, in turn, may be further modified. If a classifier is not provided, then the default class will be modified. Just to get into the swing of things, here are some new applications of CCMs that we are already familiar with: -de- Middle voice CCM The noun version of the middle voice alteration has the meaning of a prototypical, generic object of the unmodified verb. The adjective version represents the prototypical qualities of the generic object expressed attributively, but with the original subject almost completely suppressed, and thus, usually corresponds to English words ending in "-able" or "-ible". "teyosi=teyomasi" = 'to know' -> "teyodeda" = 'a fact', 'a datum', 'an item of knowledge' -> "teyodeno" = 'knowable' [Compare the above with the passive forms "teyonuda" and "teyonuno", which would have the meanings 'something which is known' and 'known', respectively. With the passive forms, the original subject still has a strong presence. 
In the middle forms, however, the original subject is almost completely eliminated. Even so, the English gloss 'knowable' does not precisely capture the meaning of the derivation, since it is still possible to express the original subject, as in "knowable by someone". In general, the sense of the adjective forms can be best paraphrased in English as "suitable for X-ing".]

    "teyokosi" = 'to teach'              -> "teyokodeda" = 'pupil'
                                         -> "teyokodeno" = 'teachable'
    "teyomisi" = 'to study'              -> "teyomideda" = 'subject'
                                         -> "teyomideno" = 'studiable'
    "guapusi" = 'to liquify'             -> "guapudeno" = 'liquifiable'
    "paipusi" = 'to energize'            -> "paipudeno" = 'energizable'
    "zesi=zezoyasi" = 'to use'           -> "zedeno" = 'usable'
    "mesuasi" = 'to reach/to arrive at'  -> "mesuadeda" = 'destination'
                                         -> "mesuadeno" = 'reachable'

[Open adjective constructions with meanings such as 'teachable by', 'usable by', etc. can be achieved with the passive CCM "-nu-" instead of the middle CCM "-de-". However, these English constructions seem to be quite rare. I believe that, in most cases, the agent is so strongly suppressed that a middle construction is more appropriate.]

-senje- Group CCM

    "teyodeda" = 'fact/datum'  -> "teyodesenjeda" = 'data'
    "teyomideda" = 'subject'   -> "teyomidesenjeda" = 'curriculum'

-jazmi- Mass CCM

    "teyodeda" = 'fact/datum'  -> "teyodejazmida" = 'information'
    "teyomideda" = 'subject'   -> "teyomidejazmida" = 'subject matter'

Note that we did not derive words such as "knowledge" above. This is because "knowledge" implies both the facts and the ability to work with them. In other words, it involves the whole verb - not just a single argument of the verb. We'll discuss how to deal with this later.

The count/group/mass CCMs can also be used as roots. The count CCM "-gi-" converts its argument to something that can be treated as one or more discrete units.
Thus, it can provide a state root with the meaning 'distinct', 'discrete', or 'separate', where the focus would represent the referent (i.e., what the patient is distinct from). Since this is inherently relational, we will assign the default class P/F-s. Here are some examples: P/F-s: gino = gimano - distinct, stand-alone, separate gisi - to be distinct from P-s: giseno - particular, discrete, specific giseda - entity, thing/person/item/etc. A/P-d: gipusi - to single out A/P/F-s: gituesi - to differentiate P from F, to be the reason why P is distinct from F It is important not to confuse the above with the locative senses of 'apart' or 'away from'. The count CCM simply implies that the patient can be treated as a distinct entity (even if they are at the same location). The group CCM "-senje-" indicates that its argument consists of more than one similar or identical units that can be treated as a single, logical entity in which the parts contribute to the operation or function of the whole. Thus, it conveys the sense of the English word "group" only when it also has the sense 'company', 'assemblage/assembly', or 'association' as in "company of actors", or "association of medical professionals". It does NOT convey the sense of the English word "bunch", which implies incoherence and lack of function. Thus, an "assemblage of tools" would correspond to a 'tool kit', but a "bunch of tools" would not. An "assemblage of grass" would correspond to a 'lawn', but a "bunch of grass" would not. In other words, the group CCM implies a strong sense of purpose or association, unlike the generic state derivations which imply a very weak sense of association and which never imply a sense of common function or purpose. Thus, "-senje-" creates a functional grouping of its arguments, and, as a root it will have the meaning 'patient is a member of an association/assembly/ company/logical or functional grouping indicated by the focus'. 
Since this is inherently relational, the default class will be P/F-s. Here are some examples: P/F-s: senjesi - to be a member of, to belong to, to be part of senjeda - member F/P-s: senjevida - assemblage/company/grouping/association/ club/society senjevigiu - an assemblage/company/association of P/F-d: senjedosi - to become a member of AP/F-d: senjesuasi - to join AP-d [-F]: senjesuadesi - to join up AP-d [+F]: senjesuagasi - to sign up (with), to enlist/enroll (in) AP/F-s: senjefisi - to associate/affiliate with, to be in league with The mass CCM "-jazmi-" indicates that its argument is a homogeneous mass, which implies that it can be divided into a greater number of masses of equal homogeneity. Thus, the mass CCM represents the inherently scalar state concept of 'homogeneous', 'uniform', or 'consistent'. Since this is an inherently scalar state, the default class will be P-s. Here are two examples (we'll see additional examples later): P-s: jazmino = jazmiseno - uniform, homogeneous, consistent A/P-d: jazmipusi - to blend, to homogenize So, what other CCMs are needed? Below is a list of a few CCMs that I feel are most useful. The list is only partial, and I'm sure we could come up with several more. [In fact, if you have any good candidates for the list, please let me know!] -fu- Infinitive CCM The infinitive CCM is used when the verb is part of an embedded sentence and when its subject is the closest preceding argument of the outer verb. The English equivalent is the particle "to", as in "John wants to go now" or "He tried to open the door" or "I told the children to sit down". Be careful not to confuse the infinitive with the purpose case role, as in "Bill opened the window (in order) to cool off the room". -ve- Essential quality CCM and 'ability' For basic nouns, the essential and distinctive quality of the entity; i.e., what it 'has' that uniquely identifies it. For basic verbs, the essential and distinctive quality of a prototypical, generic subject. 
For derivations from agentive verbs, the meaning will indicate some kind of skill or capability. Most English equivalents will end in "-ness" or "-ity". To determine the closest English equivalent, ask the question: "What essential and distinctive quality does the entity or a prototypical subject of the verb 'have'?"

Examples:

     "teyosi" = 'to know'  ->  "teyoveda" = 'knowledge'
     "teyokosi" = 'to teach'  ->  "teyokoveda" = 'teaching ability'
     "tencisi" = 'to be intelligent'  ->  "tenciveda" = 'intelligence/smartness'
     "guaniuda" = 'broth'  ->  "guaniuveda" = 'brothiness'
     "losi" = 'to be similar to'  ->  "loveda" = 'similarity'
     "xausi" = 'to be hot'  ->  "xauveda" = 'heat/hotness'
          [Do not confuse this with the more technical term "xaupaida", meaning 'thermal energy'.]
     "guaseda" = 'liquid'  ->  "guaseveda" = 'fluidity', 'liquidness'
     "zesi" = 'to use'  ->  "zeveda" = 'control'
     "veyasesi" = 'to be real/existent'  ->  "veyaseveda" = 'reality/existence'
     "jazmino" = 'uniform', 'homogeneous'  ->  "jazmiveda" = 'uniformity', 'homogeneity'
     "senjesi" = 'to be a member of'  ->  "senjeveda" = 'membership', 'affiliation', 'constituency'
          [English sometimes uses the words "membership", "affiliation", and "constituency" as synonyms for "members", as in "The membership will vote tomorrow". In the sample language, we can use simply "senjeda" for this purpose.]

English examples: happiness, oldness, civility or politeness, reading ability/skill, friendliness, humanity (= humanness), tallness, redness, etc.

Note that both nouns and adjectives can be formed from the active and passive voice derivations. For example, if the P/F-s verb meaning 'to love' is "xendasi = xendamasi", then we can derive the following:

     "xendano" = 'loving'
     "xendaveda" = 'love'
     "xendanuno" = 'lovable'
     "xendanuveda" = 'lovableness'

Applying the same logic, verb forms will mean 'having the quality'. Thus, from "teyokosi" meaning 'to teach', we can derive "teyokovesi" meaning 'to have the ability to teach' or 'to know how to teach'.
In effect, we are divorcing the act from the ability. Thus, the verb "teyokovesi" means that someone has the ability to teach, but not necessarily that this ability is actually put to use. Also, the generic AP/F-s action verb "zefivesi" indicates that the subject has the ability to do or perform the deed elaborated by the focus. Thus, the verb "zefivesi" is equivalent to the English verb "can" or "to be able", as in "John can swim" or "She was able to convince her husband". Note though, that this verb is not likely to be used very much in the sample language, since it is much more efficient to add "-ve-" directly to a verb. For example, it is more efficient to say "John teyokovesi" rather than "John zefivesi teyokofusi", even though both mean 'John knows how to teach' (where "-fu-" = the infinitive CCM discussed earlier). The adjective form "zefiveno" means 'able/capable', and the noun form "zefiveda" means 'an able/capable individual'. The essential quality of one who is 'able/capable' is 'ability/capability'. Thus, the word meaning 'ability/capability' is "zefiveveda". The passive adjective "zefinuveno" means 'doable'. Finally, when used as a root, "-ve-" will have the default class P/F-s with the meaning 'P has the qualities/nature/characteristics of F' or 'P is essentially/inherently F'. Thus, the sentence "John vesi a good teacher" means 'John has the qualities of a good teacher' or 'John is inherently a good teacher'. The inverse "vevisi" means 'to be the nature of', 'to be inherent to' or 'to be an essential/inherent quality of'. Thus the adjective "vevino" would be equivalent to English "inherent/essential/characteristic" and the "0" adverb "vevilape" would be equivalent to English "inherently/essentially/characteristically". The noun form, "vevida", is equivalent to the English words "nature", "quality", "essence", or "character". When referring to an active entity, "vevida" would thus include the sense 'ability'. 
However, "vevida" is obviously vaguer and more general than "zefiveveda" derived earlier.

-pa- Process CCM

For verbs, the actual process that takes place; i.e. a word which describes what is happening to a generic subject of the verb. Many English equivalents will end in "-ion", "-ing", or "-nce", but there are many exceptions. To determine the closest English equivalent, ask the question: "What process is the subject taking part in, or what is happening to the subject in a generic event or situation described by the verb?"

Examples:

     "teyokosi" = 'to teach'  ->  "teyokopada" = 'pedagogy/teaching/tutelage'
     "teyodosi" = 'to learn'  ->  "teyodopada" = 'learning/education'
     "tencipiasi" = 'to become more intelligent'  ->  "tencipiapada" = 'intellectual growth'
     "guapusi" = 'to liquify'  ->  "guapupada" = 'liquification'
          = 'the process by which an agent causes something to become liquid'
     "guapiasi" = 'to liquify', 'to become liquid'  ->  "guapiapada" = 'liquification'
          = 'the process experienced by a patient that becomes liquid'
     "jazmino" = 'uniform', 'homogeneous'  ->  "jazmipada" = 'blending', 'homogenization'
     "menafisi" = 'to avoid'  ->  "menafipada" = 'avoidance'

English examples: destruction, immigration, dying, movement, giving, inclusion, closure, calculation, aging, etc.

Note that these derivations are all inherently 'mass' nouns. Thus, once we have derived a word for the process, we can create instances or individual acts of the situation using the 'count' classifier. In other words, when a count or group CCM is applied to a verbal stem (as opposed to a basic noun stem), it will indicate a prototypical event or group of such events.
Here are some English examples: "class, lesson or teaching session" from "teaching" "act of destruction" from "destruction" "study session" from "studying" "song" from "singing" "action" from "doing" "donation" from "giving" "a/the death" from "dying" "a dance" from "dancing" "a snore" from "snoring" "a quarrel" from "quarreling" "a play session" from "playing" And so forth. Using the group CCM, we can derive useful words such as "course" from "teaching", "songfest" from "singing", etc. When used as a root, "-pa-" will have the default class A/P-d, and the meaning of the root will be 'Agent causes patient to undergo the process associated with F'. Thus, the verb "pasi" means simply 'to process' or (colloquially) 'to handle or deal with', as in "John processed the apples before he processed the pears". The A/P-s form "pazoyasi" would be equivalent to English 'to work on'. The F-s [-P] form "pamadeda" means 'process' or 'operation'. The F-d [+A] [+P] open noun "pakojaugiu" means 'the processing of P by A'. And so on. -mante- Process Result or Product CCM Dynamic verbs indicate that a change of state occurs, and it is often useful to have a word that describes the end result or product of the process. Static verbs can also have identifiable products. In the sample language, we will use "-mante-" to indicate these products. (For "-p" verbs, it will indicate that the result is only potential or attempted.) Here are a few examples: "benzosi" = 'to close' -> "benzomanteda" = 'closure' "teyodosi" = 'to learn' -> "teyodomanteda" = 'erudition' (i.e. the RESULT of learning - NOT the process itself.) "jazmino" = 'uniform', -> "jazmimanteda" = 'blend', 'homogeneous' 'uniform mass' Here are some English examples: to damage -> (the) damage to break -> breakage to block -> blockage to devastate -> devastation to stink -> stench to die -> (the) death to stop -> stoppage Be careful not to confuse the process with the result. 
English sometimes uses the same word for both concepts (e.g. "destruction", "education", "distribution", "closure", "death", etc.). You can test the semantics by creating a phrase such as "the resulting X". If it makes sense, then it indicates the result rather than the process.

When used as a root, "-mante-" will have the meaning 'patient is the result or product of a process associated with the focus', and the default class will be P/F-d. When the focus is a basic noun (as opposed to a verbal derivation), the root will indicate that the patient is a derivative/derivation of the focus. Thus, the word "mantesi" means 'to result from', 'to be due to', 'to come from', 'to be a derivative of', etc. The noun "manteda" means 'result, product, effect, aftermath, outcome, derivative/derivation, or consequence'. The adjective "manteno" means 'resulting', 'ensuing', 'resultant', 'consequent', etc.

-xa- Genitive CCM

This CCM will convert any noun into its genitive or possessive counterpart. Thus, it will correspond to the apostrophe-s in English possessive constructions, the preposition "of" in English or "de" in French, the particle "no" in Japanese, the suffix "-de" in Chinese, and the simple juxtaposition of two nouns in languages such as Indonesian and Cambodian. Here are a few examples:

     "teyokoda" = "teacher"
     "teyokoxano" = "teacher's"
          e.g. This is the teyokoxano pencil.
               This is the teacher's pencil.
     "teyokoxada" = "teacher's (item/thing)"
          e.g. This is the teyokoxada.
               This is the teacher's.
     "teyokoxasi" = "to be the teacher's"
          e.g. The pencil teyokoxasi.
               The pencil is the teacher's.
          [Note that "-xa-" changes the verb class to P-s.]

     "guasuda" = "duck"
     "guasuxano" = "duck's"
          e.g. The duck's food is over there.

The "-xa-" CCM has the same weak sense of association as the generic P/F-s open adjective "mabie", and may be used in its place. In effect, the CCM "-xa-" is a shortcut for the open noun construction.
Thus, "teyokoxano pencil" is synonymous with "pencil mabie teyokoda". When used as a root, "-xa-" will represent the basic genitive relationship of 'having' or 'possession', and will be P/F-s by default. Thus, the verb "xasi" means simply 'to have'. As we will see later, this is not the only way to derive this meaning, but use of "-xa-" in root position will provide other advantages. -tu- Reflexive CCM In a reflexive construction, an argument is marked as being identical to the subject of the verb. Most reflexive constructions in English use the morpheme "self" to mark this function. In the lexical semantic system we are discussing here, this function is often performed by deriving a verb whose subject is AP. For example, the verb "to kill" is an A/P-d verb, while the AP-d version means 'to kill oneself or commit suicide'. There are situations, however, when we must reflexivize a focus, creating subjects that are either PF, APF, or AF. The reflexive CCM "-tu-" will allow us to do this. Here are some examples: P/F-s "menasi" = 'to be away from' A/P/F-s "menatuesi" = 'to keep (something) away from (somewhere) AF/P-s "menatuetusi" = 'to keep (something) away', 'to hold at bay' AF/P-d "menakotusi" = 'to send away', 'to cause to leave' Here are a few English examples: AF-p [+P] 'to apologize' (earlier, we discussed the derivation of this verb in detail, in the section on derivations using action concepts.) AF-d [+P] 'self-explanatory', 'obvious' AP/F-s 'to accompany' P/F-s 'to be with' AF/P-s 'to bring/take along' APF-s 'self-admirer' and 'self-admiration' 'self-contempt' Note that, in all cases, X/Y becomes XY and X/Y/Z becomes XZ/Y (NOT XY/Z!). There is never a need to go from X/Y/Z to XY/Z since this capability is already available as a basic verb derivation. We can use the generic noun "tuda" to represent English words such as "myself", "themselves", etc. when we wish to create a stand-alone reflexive. 
(This is actually closer to the Japanese "jibun", since it does not indicate person or number.) Here are a few examples: He killed tuda = He killed himself. I saw tuda in the mirror = I saw myself in the mirror. To precisely indicate person and number, "-tu-" can be added to an appropriate personal pronoun, as is done in English, although this is never necessary since the result must ALWAYS refer to the subject of the verb. [We'll discuss the derivation of personal pronouns later, in the section on _deictics_.] The adjective form "tuno" can represent the English word "own", as in the following: He killed tuno mother = He killed his own mother. I wanted tuno business = I wanted my own business. OR = I wanted a business of my own. Reflexive "-tu-" is unusual among voice-changing CCMs in that it actually provides the argument for the slot it fills; i.e., the argument it represents is identical to the subject. Because of this, we are adopting the convention that when "-tu-" is part of the verb or modifies the verb as an adverb, it will fill the appropriate argument of the verb and delete it from the argument structure. When used as other parts-of-speech, it will refer back to the subject, but will not actually delete the argument slot of the verb. Thus, the following two sentences are identical in meaning: He killed tuda = He killed himself. He killed tupe = He killed himself. But the following are NOT identical (where 'to teach' = A/P/F-d): He taught tuda John = He taught himself about John. He taught tupe John OR He taught John tupe = He taught John about himself. [The syntax of the AL will determine which word order is appropriate for the second example. In the sample language, either order is acceptable.] In other words, "tuda" fills whatever slot it appears in without changing the argument structure of the verb, while "tupe" always deletes and replaces the LAST argument, no matter where it appears, exactly as if "-tu-" had been affixed directly to the verb. 
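The rule given earlier for "-tu-" (X/Y becomes XY, and X/Y/Z becomes XZ/Y) is mechanical enough to sketch in a few lines of Python. This is only an illustration: the function name "reflexivize" and the use of plain strings such as "A/P/F-s" as class labels are assumptions made for the sketch, not part of the sample language.

```python
def reflexivize(verb_class: str) -> str:
    """Apply the "-tu-" rule to a class label such as "A/P/F-s".

    X/Y   -> XY      (the focus is merged into the subject)
    X/Y/Z -> XZ/Y    (the LAST argument is merged into the subject)
    """
    # Split "A/P/F-s" into the argument slots and the -s/-d/-p suffix.
    slots, _, suffix = verb_class.partition("-")
    parts = slots.split("/")
    if len(parts) == 2:
        subject, last = parts
        new_slots = subject + last
    elif len(parts) == 3:
        subject, middle, last = parts
        new_slots = subject + last + "/" + middle
    else:
        raise ValueError(f"expected two or three slots: {verb_class!r}")
    return new_slots + ("-" + suffix if suffix else "")


for label in ("P/F-s", "AP/F-s", "A/P/F-s", "A/P/F-d"):
    print(label, "->", reflexivize(label))
```

The third case corresponds to the step from A/P/F-s "menatuesi" to AF/P-s "menatuetusi" shown earlier. The sketch deliberately has no X/Y/Z -> XY/Z case, since that change is already available as a basic verb derivation.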
Finally, English often uses "self" in ways that are not truly reflexive. For example, words like "self-discovery" and "self-satisfaction" are essentially idiomatic, and the reflexive CCM "-tu-" does NOT capture these meanings. Others, such as "self-ignition", imply that something happens automatically, with NO apparent agent. These can be implemented using the basic P-d version of the verb. Also, expressions such as "he himself" are emphatics, and NOT true reflexives. [We'll discuss how to derive emphatics later.]

-bo- Reciprocal CCM

In a reciprocal construction, the subject performs the roles of both subject and object. Most reciprocal constructions in English use a plural or compound subject and the phrase "each other" as the object, as in "They punched each other". Some verbs, however, are inherently reciprocal, and we will use the "-bo-" CCM to create them. Thus, this CCM will change the argument structure of a verb from X/Y-x to X=Y-x or from X/Y/Z-x to X=Y/Z-x. (Note the use of "=" in the notation "X=Y-x". This is necessary since the semantics of reciprocal XY is different from the semantics of normal XY.)
Here are some examples:

     P/F-s:  "masi" - 'to be in a relationship with', 'to be involved with', 'to have something to do with'
     P=F-s   "mabosi" = 'to mutually/reciprocally experience an unspecified state', 'to have an unspecified association/relationship with each other'
             "maboda" = 'correlatives' (i.e., things which have an unspecified relationship or association with each other)
             "mabovesi" = 'association', 'relationship' ["-ve-" = quality CCM]
             "mabopasi" = 'a mutual/reciprocal experience' ["-pa-" = process CCM]
             "mabopano" = 'mutual/reciprocal'

     P/F-s   "mesi" = 'to be located at/in'
     P=F-s   "mebosi" = 'to be together'
             "mebope" = 'together'

     P/F-s   "menasi" = 'to not be located at', 'to be away from'
     P=F-s   "menabosi" = 'to be apart'
             "menabope" = 'separately'

     P/F-d   "medosi" = 'to come to'
     P=F-d   "medobosi" = 'to come together (locative sense)', 'to meet'

     P/F-d   "menadosi" = 'to become located away from'
     P=F-d   "menadobosi" = 'to come apart'

     AP/F-d  "mesuasi" = 'to reach/arrive at'
     AP=F-d  "mesuabosi" = 'to get together'
             "mesuabopada" = 'a get together'

     AP/F-d  "menasuasi" = 'to leave/depart', 'to go from'
     AP=F-d  "menasuabosi" = 'to split up'

Here are a few English examples:

     AP/F-d   'to join'
     AP=F-d   'to gather/assemble', 'to come together'

     A/P-p    'to argue/quarrel with'
     A=P-p    'to argue/quarrel'

     A/P/F-p  'to speak about'
     A=P/F-p  'to converse/talk about', 'to have a conversation about'

     AP/F-s   'to agree/disagree with'
     AP=F-s   'to agree/disagree'

We can use the generic noun "boda" to represent the English phrase "each other" or "one another" when we need to apply the concept in a non-verbal form. For example, "They stole each other's money", where "each other's" would be either "boxano", or "mabie boda".

Note that "-bo-" is like reflexive "-tu-" in that it actually provides the argument for the slot it fills. Because of this, we can use it as an affix or in root position in exactly the same way we used "-tu-".
Thus, "boda" will fill whatever slot it appears in without changing the argument structure of the verb, while "bope" will delete and replace the second argument, no matter where it appears, exactly as if "-bo-" had been affixed directly to the verb. Finally, it may also be useful to have a reciprocal CCM that equates the patient and the focus of an A/P/F verb; i.e. A/P/F -> A/P=F. For this purpose, I will use the CCM "-pasku-". For example, the generic A/P/F-d verb "kosi" meaning 'to change something with respect to something else', would become "kopaskusi" meaning 'to re-arrange'. At the moment, though, I can't think of any other examples. -vua- CCM to make all arguments of a verb oblique It will be useful to have a CCM for a voice change that strips all arguments from a verb but makes them obliquely expressable. This CCM will be most useful in various kinds of greetings. Here's an example (we will use the speech act root "-jandoya-" to represent the meaning 'congratulate', default class A/P/F-p): A/P/F-p: "jandoyasi=jandoyaniosi" = 'to congratulate on' e.g. I congratulated him on his victory. 0-p [+A] [+P] [+F]: "jandoyavuasi" = 'congratulations' e.g. Congratulations! Congratulations from all of us! Congratulations to the newlyweds! Congratulations on your great success! where the case tag "from" introduces an oblique agent, "to" introduces an oblique patient, and "on" introduces an oblique focus. Note that, although "congratulations" is morphologically a noun in English, it is clearly being used as a speech act verb. This CCM would also be useful with verbs such as 'to greet', 'to bless/curse', 'to shame', 'to thank', 'to apologize', 'to welcome', etc. 5.1 THE GENERAL NEGATOR "-na-" The morpheme "-na-", meaning 'not' or 'other than' is considered a class- changing morpheme because it creates an entity or state that is completely different from what it modifies. Thus, it can change the entity or state so much that it is no longer in the same class (cf. 
'mammal' vs. 'non-mammal').

Now, to be truly useful, we must also be able to modify concepts that are more complex than simple roots. For example, 'to become not-at-a-place' and 'not to become-at-a-place' are completely different concepts. The first means essentially 'to leave' while the second means 'not to arrive'. Here's a more complete example:

     AP/F-d verb "me-sua-si": to cause oneself to become 'at' a location = to arrive
     AP/F-d noun "me-sua-da": one who causes himself to become 'at' a location = arriver

     Verb with negated root "me-na-sua-si": to cause oneself to become (other than 'at') a location = to depart/leave
     Noun with negated root "me-na-sua-da": one who causes himself to become (other than 'at') a location = departer

     Verb with negated root+classifier "me-sua-na-si": other than (to cause oneself to become 'at' a location) = not to arrive
     Noun with negated root+classifier "me-sua-na-da": other than (one who causes himself to become 'at' a location) = non-arriver

In the last two examples, we modified the combination of root PLUS classifier (normally referred to as a _stem_). Thus, since the morphology of the sample language is right-branching, the modifier had to be placed after the classifier.

For basic nouns, modifying the root BEFORE applying the classifier can result in a noun that has nothing in common with the unmodified noun. Here are some derivations that illustrate this point:

     guamoda -> "water mammal" -> cetacean
     guasuda -> "water bird" -> duck
     guanamoda -> "non-water mammal" -> camel (???)
     guanasuda -> "non-water bird" -> vulture (???)
guamonada -> non-cetacean guasunada -> non-duck guanamonada -> non-camel guanasunada -> non-vulture Now, let's do some additional derivations using "-na-": P/F-s "lundape" = 'before' (case tag) "lundanape" = 'at or after' P-s "lundasepe" = 'already', 'earlier', 'in the past' "lundanasepe" OR "lundasenape" = 'not earlier = now and later', 'from now on' A/P-d "benzosi=benzopusi" = 'to close' "benzonapusi" = 'to open' [Note that "-na-" must precede the classifier to obtain the sense 'to cause to be not-closed'. If it followed "-pu-", it would simply mean 'to not close' or 'to do something other than close'.] P-s "guaseno" = 'liquid' (adjective) "guanaseno" OR "guasenano" = 'non-liquid' P-s "veyaseno" = 'real', 'existent' "veyanaseno" OR "veyasenano" = 'nonexistent', 'fictitious/imaginary', 'not real' P-d "veyanapiano" = 'extinct', 'no longer real or existent' P-d [+A] "veyanapununo" = 'annihilated', 'wiped out' A/P-s "zoyasi" = 'to keep/maintain' "nazoyasi" = 'to prevent/preclude' Basic Noun "zioda" = 'insect' "zionada" = 'non-insect' (adjective) And so forth. Note how "-na-" can be used either after the root or after the classifier in the "-s" verbal derivations with little change in meaning. This should not be surprising since the "-s" derivations are just morphologically complete versions of the state concept. However, in the "-d" and basic noun derivations the placement of the modifier is critical. Still, even in the "-s" derivations, the placement of the MCM does have an effect on the meaning. Consider the following: guanasesi -> to be non-liquid guasenasi -> to not be liquid If the root is modified, it implies that the INHERENT state is negated. If the root+classifier is modified, it implies that the CURRENT state is negated. Thus, modifying the root indicates a more permanent or natural condition, while modifying the root+classifier indicates a temporary and potentially changeable situation. 
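The two placements of "-na-" can be illustrated mechanically. The following is a minimal Python sketch, assuming that a word is simply the concatenation of a root, its classifiers, and a terminator; the function names are hypothetical and exist only for this illustration.

```python
def negate_root(root, classifiers, terminator):
    """Place "-na-" directly after the root, so only the root concept is negated."""
    return root + "na" + "".join(classifiers) + terminator


def negate_stem(root, classifiers, terminator):
    """Place "-na-" after root+classifier, so the whole stem is negated."""
    return root + "".join(classifiers) + "na" + terminator


examples = [
    ("gua", ["se"], "si"),   # 'to be liquid'
    ("veya", ["se"], "no"),  # 'real', 'existent'
    ("me", ["sua"], "si"),   # 'to arrive'
]
for root, classifiers, terminator in examples:
    print(negate_root(root, classifiers, terminator),
          negate_stem(root, classifiers, terminator))
```

Each printed pair reproduces forms already given above: "guanasesi" vs. "guasenasi", "veyanaseno" vs. "veyasenano", and "menasuasi" ('to depart') vs. "mesuanasi" ('not to arrive').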
Thus, the derivations using the root "-veya-" meaning 'real or existent' are more accurately glossed as: P-s "veyaseno" = 'real', 'existent' "veyanaseno" = 'fictitious/imaginary', 'unreal' "veyasenano" = 'nonexistent', 'not (currently) real' Thus, we may use the word "veyanaseno" for a unicorn, and the word "veyasenano" for a dinosaur, but not vice-versa. When used in generic state verbs, "-na-" will imply a non-relationship for "-s" verbs and a breaking-off of a relationship for "-d" verbs. For example, the P/F-s verb "masi" means 'to have something to do with'. Thus, "namasi" means 'to have nothing to do with'. The AP/F-d verb "suasi" means 'to change' in the sense that the subject is entering a new relationship with the focus. Thus, the negative form "nasuasi" has the sense 'to break off', 'to leave', 'to give up', 'to quit', and so on, as in "He gave up his membership" or "He left the partnership". Keep in mind that generic verbs do not specify the nature of the relationship. Thus, the unspecified relationship can be ANYTHING: social, mental, locative, or even temporal. [Incidentally, English seems to have many periphrastic, metaphoric, and even idiomatic expressions that cover the meaning of "nasuasi". However, I was pleased to recently discover that another language, Swahili, has the single verb root "acha" that seems to precisely cover all of the semantic space of the verb "nasuasi". It is also an extremely common word in Swahili, and is used in many compounds and periphrastic expressions.] 7.0 ROOTS AS MEANING-CHANGING MORPHEMES Most roots represent basic states, and it is often useful to allow a state to modify another state or entity. For example, if the root meaning 'like/ similar' could be used to modify the basic noun meaning 'duck', we could create a word meaning 'duck-like'. Fortunately, our morpho-semantic system is robust enough to allow roots to directly modify other roots. 
When used in this way, we will refer to the root as a _meaning-changing morpheme_ (henceforth MCM). Furthermore, any morpheme that is not being used as a root will be referred to as a _modifying morpheme_. Thus, modifying morphemes will include all roots that can be used as MCMs, all classifiers, and all CCMs. [Note that we do not consider terminators to be true morphemes, since their function is more syntactic than semantic. While this distinction may not be linguistically "correct", it is useful for our purposes.]

Important Note: I will be using the abbreviation "MCM" extensively throughout the remainder of this monograph. Note that this definition is a functional one, not a morphological one. In effect, an MCM is any morpheme that is used to modify what it is suffixed to.

To reflect this new usage, the morphology of the sample language will be simplified as follows:

     Word           ::= { Morpheme } + Part-of-speech
     Morpheme       ::= C D | C V { X }
     Part-of-speech ::= Terminator
     Terminator     ::= da | di | giu | nia | bie | no | pe | si

     C  = any consonant (p, b, t, d, k, g, c, j, l, m, n, f, v, s, z, x)
     D  = any diphthong (ai, au, eu, ia, ie, io, iu, oi, ua, ue, ui, uo)
     V  = any vowel (a, e, i, o, u)
     S  = any semi-vowel (w, y)
     X  = extension = S V | C C V
     |  = logical 'or'
     {} = enclosed item may appear zero or more times
     Lower case letters represent themselves

Note that the morphology no longer makes a distinction between roots, classifiers, and CCMs. Any morpheme or combination of morphemes may appear before a terminator, as long as the result is semantically acceptable. And, as we will discuss in the next section, it is even possible for terminators to appear alone.

Now, let's look at some morphemes that can be used as either roots or MCMs. We already met one such root when we derived the manner case tag. This root was "-lo-" with the meaning 'like' or 'similar'.
As an MCM suffixed to basic nouns, the adjective form will represent the English suffix "-like", as in "house-like", "ridge-like", or "bird-like", while the noun form would be equivalent to the English prefix "quasi-". When used with verbs, it will represent the English expression "sort of" or "kind of", as in "He kind of hopped into the house" or "He sort of cleaned his bedroom". Another useful MCM can represent the concept behind the English expressions "back" or "in return", as in "He shouted back at me" or "I gave her a kiss in return". For transitive verbs, it implies that a previous situation existed in which the current agent and a current non-agent had reversed roles. (The actual non-agentive argument is determined from context.) For intransitive verbs, it implies that the patient is returning to an earlier state. These derivations are probably not going to be useful with "-s" verbs. In the sample language, we will use the MCM "-sifne-". Here are some examples: A/P-d: "benzosi" = 'to close' "benzosifnesi" = 'to close in return' (E.g. "He closed my window so I closed his in return.") A/P/F-p: "jandoyasi" = 'to congratulate (on/for)' "jandoyasifnesi" = 'to congratulate back/in return' English examples: to give back to, to return to to shout back at to do (someone) a favor in return to throw (something) back to to come back to etc. As a verb root, "-sifne-" represents the action concept 'to do in return' (default class = AP-s). Thus, we can create the AP-s verb "sifnesi" meaning 'to reciprocate', the AP/F-s verb "sifnefisi" meaning 'to reciprocate by', and the stand-alone adverb "sifnepe" meaning 'back' or 'in return'. Note, though, that some English verbs are inherently reciprocating, as in "I returned (= gave back) the book to Joan". In the sample language, these verbs would be created by appending "-sifne-" after the verb classifier. Do not confuse "-sifne-" with the English word "again", which simply means 'one more time'. 
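Derivations such as "benzosifnesi" can be taken apart mechanically. In the simplified morphology given earlier in this section, every morpheme begins with a consonant followed by a vowel or diphthong, while extensions begin with a semi-vowel or a pair of consonants. The Python sketch below transcribes that grammar into a regular expression. The function name "segment" is hypothetical, and the sketch checks only the form of a word, not whether the result is semantically acceptable.

```python
import re

C = "[pbtdkgcjlmnfvszx]"                         # consonants
D = "(?:ai|au|eu|ia|ie|io|iu|oi|ua|ue|ui|uo)"    # diphthongs
V = "[aeiou]"                                    # vowels
S = "[wy]"                                       # semi-vowels
X = f"(?:{S}{V}|{C}{C}{V})"                      # extension = S V | C C V
MORPHEME = re.compile(f"{C}(?:{D}|{V}{X}*)")     # C D | C V { X }
TERMINATORS = {"da", "di", "giu", "nia", "bie", "no", "pe", "si"}


def segment(word):
    """Split a word into (list of morphemes, terminator)."""
    chunks, pos = [], 0
    while pos < len(word):
        match = MORPHEME.match(word, pos)
        if match is None:
            raise ValueError(f"not well-formed at position {pos}: {word!r}")
        chunks.append(match.group())
        pos = match.end()
    # Terminators have the same shape as morphemes, so the last chunk
    # found by the scan must be one of the eight terminators.
    if not chunks or chunks[-1] not in TERMINATORS:
        raise ValueError(f"word does not end in a terminator: {word!r}")
    return chunks[:-1], chunks[-1]


print(segment("benzosifnesi"))    # (['benzo', 'sifne'], 'si')
print(segment("jandoyasifnesi"))  # (['jandoya', 'sifne'], 'si')
```

Because the grammar allows zero morphemes before the terminator, a bare terminator such as "da" is accepted and yields an empty morpheme list.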
Note that "-sifne-" is similar to the reciprocal CCM "-bo-", which we discussed earlier, but is different in two important ways: first, it does not change the argument structure of the verb; and, second, it implies a sequential event, whereas the CCM "-bo-" implies simultaneity. Thus, the MCM changes the meaning of the word while the CCM changes the argument structure and provides a semantic link between the arguments. We can also create a useful combination of the two morphemes "-veya-", meaning 'real/existent', and "-na-", the basic negator. The result, "-veyana-", can be used as a suffix meaning 'not real' and is equivalent to the English prefix "pseudo-". Here's a short list of some more examples of concepts that could be easily implemented using additional root/MCMs: Concept English examples ------- ---------------- 'male' -> "stallion" from "horse" 'female' -> "mare" from "horse" 'neuter' -> "gelding" from "horse" 'artificial' -> all man-made-X's The list is intentionally short because I would like to discuss most of the important modifiers in greater detail and in the appropriate semantic context. I will do this in several sections of the remainder of this monograph. 8.0 SIMPLE GENERICS The simplest possible generic derivation consists of just a terminator. By its very nature, a pure generic like this can encompass any or all possible referents. Thus, when a speaker uses such a generic, he is implying that he either doesn't know the specific referent or is not willing to divulge it. In effect, the simplest generics perform a function that is similar to the impersonal constructions of English and other natural languages. Here are some of the more useful ones: Generic noun "da" - 'someone', 'something(s)', 'they' e.g. SOMEONE broke the window. THEY don't make cars like THEY used to. Billy just broke SOMETHING. Generic adjective "no" - 'some (kind of)' e.g. She's SOME KIND OF fortune teller. SOME jerk just blocked my car. 
Generic verb "si" - 'something's going on', 'something happened' e.g. SOMETHING'S GOING ON here. If he persists, SOMETHING's bound TO HAPPEN. [Note that since "si" does not specify an argument structure, it cannot have ANY core arguments, and may stand alone as a complete sentence. All arguments, if any, must be oblique.] Generic adverb "pe" - 'just...period', 'that's all', verb terminator (but without the implied rudeness or abruptness of the English expression 'just...period') e.g. I don't know what happened. He JUST left, PERIOD. [Here, "pe" simply indicates that more arguments of the verb are available (either core or oblique), but that the speaker either doesn't know or is not willing to divulge them.] Generic previous-word modifier "di" can be used to modify adjectives and adverbs. It would have meanings such as 'somehow', 'in some way or other' e.g. She was a pretty girl, SOMEHOW OR OTHER. He seemed to be an important person, SOMEHOW. [Here, English requires that "somehow (or other)" be a sentential adverb. In reality, it modifies the adjectives "pretty" and "important".] If we apply the negating CCM "-na-", the results are also very useful: "nada" - 'nobody', 'no one', 'nothing' "nano" - 'no' e.g. He's NO fool. "nasi" - 'nothing's going on', 'nothing happened' "nape" - similar to "pe", but emphasizes that there is NOTHING more to be said. "nadi" - 'nohow', 'in no way' 8.1 SLOT-FILLERS AND TERMINATORS In our sample language (as well as in many natural languages), verbs are marked to explicitly show their argument structure. Thus, for instance, a speaker will not normally use a verb that takes a focus unless he plans to provide a focus. In English, however, objects are often omitted, as in the following: John is eating vs. John is eating a sandwich. Bill told a joke vs. Bill told the kids a joke. There will be times, though, when a speaker wishes to emphasize that an argument is being intentionally omitted. 
In our sample language, the basic generics "pe" and "da" allow us to do this. The generic "da" fills a SINGLE empty slot in the argument structure of the verb, while generic "pe" completely terminates the verb, allowing no more arguments, even oblique arguments. English impersonal pronouns are equivalent to "da", and the English expression "that's all" captures some of the meaning of "pe". Incidentally, some natural languages achieve a similar termination effect by using explicit open/close morphemes that are very reminiscent of parentheses. Here's an example from Malagasy: ity trano fotsy ity this house white this 'this white house' There are also many languages, such as Persian (Iran), Yoruba (West Africa), and Hewa (Papua New Guinea), that bracket their relative clauses with explicit start and end morphemes. Although this may seem unnecessary and even redundant, it can be useful at times to prevent ambiguity. In our sample language, however, there is no need to create such morphemes, since "da" and "pe" can be used as terminators when needed. 9.0 POLARITY This section is basically about the derivation of words that are most commonly referred to as "opposites" or "antonyms". However, I'm going to extend the semantics somewhat to make the system as productive as possible, and will use the term "polarity" to refer to the semantics of this system. We've already had some exposure to the concept of polarity when we used the CCM "-na-", meaning 'not' or 'other than'. This morpheme can, in fact, be used to create true opposites, but only for state concepts that can have only an "either/or" interpretation. I will refer to these as _binary_ states. 
Here are some English examples:

    open    -> close      = become 'not open'
    attach  -> detach     = become 'not attached'
    recall  -> forget     = become 'not in-memory'
    enter   -> exit       = become 'not inside'
    zip     -> unzip      = become 'not zipped'
    real    -> imaginary  = be 'not real'
    same    -> different  = be 'not the same'

In other words, for binary states, anything that is 'not X' is by definition 'the opposite of X'.

And since the CCM "-na-" basically means 'other than', it can also be used to create other kinds of binary oppositions, such as "non-American", "non-mammal", "non-believer", and so on. In effect, they contrast members of a well-defined group with everything that is not part of the group. (In fact, the result may not even be in the same class, which is why "-na-" is a CCM rather than an MCM). In other words, these constructions do not tell us what their referents ARE - they only tell us what their referents are NOT. However, this distinction is not useful for our purposes, and we will treat all of the above oppositions as 'binary'.

When suffixed to state verbs (AFTER the classifier), the CCM "-na-" indicates that the state does not currently exist or apply, and often has the sense of the English suffix "-less". For example, "living" + "-na-" = 'lifeless', "weighing" + "-na-" = 'weightless', and so on. Also, note in the previous paragraph that the word "non-believer" is derived from the focused state concept; i.e., from the P/F-s verb. In order to obtain the concept of 'beliefless', as in "a beliefless individual", we would start with the P-s verb. However, the English morpheme "-less" can also be used to mean 'not possessing' or 'bereft of', as in "penniless". The CCM "-na-" does NOT have this sense!

In addition to binary concepts, there are concepts that can cover a range of states, such as 'torrid/hot/warm/lukewarm/cool/chilly/cold/frigid'. These concepts are _scalar_ and can take on more than two values.
However, natural languages almost never make minor distinctions such as between "cold" and "cool" or between "warm" and "hot" with completely different words. Instead, modifiers are normally used, as in "heavy" vs. "very heavy" vs. "not too heavy", etc. Also, when a language does make such a distinction using unique words, it is rare to find other languages that make the same distinction. For example, the Arabic word "baarid" can mean either 'cool', 'chilly', OR 'cold'. Expressions meaning 'very', 'not too' and so on are used to provide greater detail when needed.

Summarizing the above, there are basically 2 types of opposites:

    1. Binary opposites

       Examples:  real vs. imaginary
                  closed vs. open
                  Chinese vs. non-Chinese
                  member vs. non-member

    2. Scalar opposites

       Examples:  hot vs. warm vs. cool vs. cold
                  gigantic vs. large vs. small vs. tiny
                  bright vs. light vs. dim vs. dark vs. pitch black

We can conceptualize a binary concept as consisting of two opposing semantic spaces. Anything that is not in one MUST be in the other:

    ---------    ---------
    |       |    |       |
    |   +   |    |   -   |
    |       |    |       |
    ---------    ---------

The 'sizes' of the above semantic spaces can be different, with one often being significantly 'larger' than the other. For example, consider a closed door versus a fully open, partly open, or slightly open door.

A scalar concept consists of unique, non-overlapping semantic spaces on a line:

    -------------------------------------------------------------------
          |      |      |      |      |      |      |      |
      ... |  +3  |  +2  |  +1  |   0  |  -1  |  -2  |  -3  | ...
          |      |      |      |      |      |      |      |
    -------------------------------------------------------------------

Here, the "0" position could be paraphrased as "neither X nor Y", as in "neither hot nor cold".

As we've already seen, binary opposites can be implemented with the CCM "-na-". For scalar opposites, however, we cannot use "-na-" since it means 'other than'. For example, "other than hot" does not necessarily mean 'cold'. It could also mean 'scalding', 'warm', 'lukewarm', 'cool', and so on.
In other words, it means anything outside the range of temperatures indicated by the word "hot".

So, how do we deal with scalar states? In the sample language, we will create four special MCMs that can provide additional detail. These will hardly ever be needed, since most people will prefer to use external modifiers such as "very", "not too", "hardly", etc. But there will be times when these concepts will be needed in word derivations. Here are the MCMs that we will use in the sample language:

    -pi-    'maximally', 'extremely'
    -ge-    'very', 'highly'
    -so-    'not too', 'not very'
    -ju-    'minimally', 'barely', 'hardly'

In addition, we will assume that the semantic space of "-pi-" is a subset of the semantic space of "-ge-", and that the semantic space of "-ju-" is a subset of the semantic space of "-so-". We'll see examples of this below.

Note that these are MCMs, not CCMs like "-na-"! The result is always within the same class as the item that is modified.

Using the above, we could start with the word "xauno" meaning 'hot' and create words such as "xaupino" = 'torrid', "xausono" = 'warm', and "xaujuno" = 'lukewarm/tepid'. The words meaning 'cool/cold/frigid/etc' would be derived in the same way using a different root. However, as I stated above, natural languages hardly ever create distinct words to represent such concepts, depending instead on external modification.

To make matters worse, the derivations may only be approximate. For example, we could also gloss "xaupino" as either 'scorching', 'blistering', or 'scalding', but these all have implications beyond basic 'hotness', since they imply manner as well as degree of heat. Actually, the gloss 'torrid' is also inappropriate, since it has connotations of both 'dryness' and 'climate'. Keep in mind, though, that this lack of precise English counterparts is not a problem at all.
As long as the semantics of the derivations are precise, there will never be any doubt about their meaning, even though a particular derivation may not have an exact counterpart in a particular natural language. As I mentioned earlier, it is almost always impossible to find exact matches for a word in different languages. Also, the above derivations are actually more useful than the English counterparts, since they are slightly more general and can be used in more contexts. Specific implications such as 'climate' or 'dryness' are almost always obvious from context.

Here are some useful examples derived from the P-s adjective "tencino = tenciseno", meaning 'intelligent':

    tencipiseno            = 'genius'
    tencigeseno            = 'brilliant'
    tenciseno = tencino    = 'intelligent', 'smart'
    tencisoseno            = 'slow', 'dense', 'obtuse'
    tencijuseno            = 'stupid', 'dim-witted', 'retarded'
    tencinaseno            = 'non-intelligent', 'lacking intelligence'
    tencisenano = tencinano = 'not intelligent', 'not smart'

Note that someone who is 'genius' is also 'brilliant', but someone who is 'brilliant' is not necessarily 'genius'. Thus, "-pi-" derivations are a subset of "-ge-" derivations. For the same reasons, "-ju-" derivations are a subset of "-so-" derivations.

Now, what happens if we apply a scalar MCM to an inherently binary state? In my opinion, the following rule is the most consistent and productive:

    The scalar polarity MCMs force a binary state to become scalar. The resulting derivation indicates a more precise position within the semantic space of the scalar concept.
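The 'intelligent' series listed above is regular enough to be generated mechanically. The following Python fragment is only a minimal sketch of that idea: it inserts each scalar MCM between the root and the classifier, following the ordering used in the "tenci-" examples. The function names and the subset table are my own illustrative assumptions, and the sketch does not model the CCM "-na-" or any other ordering of morphemes.

```python
# Scalar polarity MCMs of the sample language, from strongest to weakest.
# The empty string stands for the unmodified state.
SCALAR_MCMS = ["pi", "ge", "", "so", "ju"]

# "-pi-" is a subset of "-ge-", and "-ju-" is a subset of "-so-".
SUBSET_OF = {"pi": "ge", "ju": "so"}


def scalar_series(root, classifier="se", ending="no"):
    """Return the five scalar forms of a state root (hypothetical helper)."""
    return [root + mcm + classifier + ending for mcm in SCALAR_MCMS]


def implies(mcm_a, mcm_b):
    """True if anything described with mcm_a is also described with mcm_b."""
    return mcm_a == mcm_b or SUBSET_OF.get(mcm_a) == mcm_b


if __name__ == "__main__":
    for word in scalar_series("tenci"):
        print(word)
```

Here `implies("pi", "ge")` holds while `implies("ge", "pi")` does not, which mirrors the observation that a 'genius' is also 'brilliant' but not necessarily the reverse.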
To illustrate the rule, consider the following derivations using the root "xoya-", meaning 'alive' (default class = A/P-d):

    P-s    "xoyasesi"            = 'to be alive', 'to live'
           "xoyaseno"            = 'alive/living'
           "xoyanaseno"          = 'non-living' (inherently)
           "xoyasenano"          = 'not living', 'lifeless' (currently)
    P-d    "xoyanapiasi"         = 'to die'
           "xoyanapiano"         = 'dead/deceased'
    A/P-d  "xoyanasi=xoyanapusi" = 'to kill'

           "xoyasepino"          = 'vibrantly alive'
           "xoyasejuno"          = 'minimally alive', 'very close to death'
           "xoyanapiapino"       = 'maximally dead', 'dead as a doornail'
           "xoyanapiajuno"       = 'minimally dead', 'barely over the line'

For another example, consider the 'open/close' distinction, where the basic root has the binary state meaning 'closed'. If the "-na-" CCM is used, we create a word with the prototypical interpretation of 'open'. Now, if we apply the scalar MCMs, the 'open' state is extended to cover the range from 'fully/wide open' to 'slightly ajar'.

Here are more examples that illustrate all of the above:

    P/F-s  "lono"     = 'alike', 'similar'
           "lopino"   = 'identical', 'exactly alike'
           "lonano"   = 'dissimilar/different'
           "lonapino" = 'truly distinct', 'completely different', 'having nothing in common'

Compare it with another relational state meaning 'same' or 'equal':

    P/F-s: "kapsusi"   = copula 'to be'
           "kapsuno"   = 'same/equal'
           "kapsunano" = 'not the same', 'unequal', 'different'

Here's another example, using a basic noun:

    A/P-d  "paipusi" = 'to energize/activate/turn on'

But how do we create its obvious and very useful opposite meaning 'to shut off' or 'to de-activate'? The word "paipunasi" would mean that energy was not being applied; i.e., 'to not turn on'. In fact, "paipunasi" could also imply that something else was being done to the patient INSTEAD OF turning it on. Because of this 'other than' sense, "-na-" is not going to be very useful when it follows the combination noun classifier + verb classifier. Still, it will have some uses.

So, what about "painapusi"?
This could be paraphrased as 'to apply non-energy to the patient', which is totally useless. So, instead, why don't we re-define the use of "-na-" when applied to basic nouns that undergo additional verbal derivation? The new semantics would be as follows:

    When a basic noun undergoes verbal derivation, the entity is 'applied to' or 'present in' the patient. When a basic noun modified by "-na-" undergoes verbal derivation, the entity is 'removed from' or 'lacking in' the patient.

Thus, we can now create a word meaning 'to turn off' or 'to de-energize':

    A/P-d  "painapusi" = 'to de-energize/de-activate/turn off'

Here are some more examples from basic nouns:

    Noun     "tencivauda"        = 'brain'
    A/P-d    "tencivaunapusi"    = 'to lobotomize', 'to remove brain matter from'
    P-s adj  "tencivaunaseno"    = 'lacking brain matter', 'brainless'
             "tencivaunasepino"  = 'completely brainless (literally)'

    Noun     "tencidengida"      = 'mind (sentient)', 'intellect'
    P-d      "tencidengipiasi"   = 'to become sentient', 'to get a mind'
    P-d      "tencidenginapiasi" = 'to become non-sentient', 'to lose one's mind'

    A/P-d    "guafapusi"         = 'to water', 'to add water to'
             "guafanapusi"       = 'to dry', 'to dehydrate', 'to remove water from'
    P-s adj  "guafanaseno"       = 'dry', 'waterless'

A completely different kind of opposite can be derived by means of the inverse grammatical voice change. These words will all be derived from P/F-s state verbs, since they indicate a relationship between two entities. Here are some examples:

    Active                 Inverse
    --------------         ---------------
    to be parent of    ->  to be child of
    to be sibling of   ->  to be sibling of (NO meaningful inverse)
    to own             ->  to belong to
    to enclose         ->  to be inside of
    to be above        ->  to be beneath/below

Opposites of this type are normally referred to as _converses_. And, as we saw earlier, inversion can also be useful for deriving other kinds of relationships from non-P/F-s verbs (e.g. 'teacher/pupil' and 'student/subject'.)
Note that with the 'male/female' MCMs we discussed earlier, we can create words for 'mother', 'father', 'son', 'daughter', 'brother', and 'sister' from the roots meaning 'parent' and 'sibling'. If we create MCMs meaning 'same sex as referent' and 'different sex from referent', we can derive words equivalent to some of the kinship terms of Austronesian languages such as Hawaiian and Malagasy. For example, we could create the equivalent of the Hawaiian word "kaikaina" meaning 'younger sibling of the same sex' from the root for 'sibling' plus the MCMs meaning 'younger' and 'same sex as referent'. Thus, "his X" would mean 'his younger brother', while "her X" would mean 'her younger sister'. In the same way, other MCMs could be created to allow derivation of the kinship terms of any natural language. When the semantics of a relationship are truly in balance, as in the parent/ child example, it is impossible to tell which is patient and which is focus. In these cases, it doesn't really matter which is active and which is inverse. In other words, we can't make a meaningful semantic distinction by making one patient and the other focus - all we can do is make a TOPICAL distinction. We might, however, adopt rules for the sake of consistency. For example, active forms could be used for the polarity that is inherently greater in magnitude or more positive in outlook (such as 'older', 'wiser', 'bigger', 'better', 'sentient', etc.). The inverse forms would be used for their counterparts. Thus, 'parent' would be basic, and 'child' would be derived from it. [This, of course, is essentially what we did above with binary states. Note that the inverse relationship is inherently binary.] 10.0 COUNTS AND MEASURES Counts (also called _quantifiers_) and measures are inherently stative because they provide more information about an entity. Consider the following: He saw students. He saw tall students. He saw three tall students. He saw three 6-foot tall students. 
Each use of a count or measure reduces the number of possible referents, just as if they were adjectives. Thus, counts and measures are inherently stative - they just happen to be quantitative rather than qualitative. 10.1 IMPLEMENTING COUNT WORDS In light of the above, it might seem most appropriate to define P-s state verbs, one per digit, which can then be combined in some way to form larger numbers. However, I feel that this is not the best approach, for three reasons: 1. Large numbers (e.g. 3214) indicate single quantities or 'states', but would require many words to implement. In general, a single quantity, whether 'six' or 'nine-hundred-and-seventy-six', should ideally be implemented as a single word, if only because it is actually USED as a single word. 2. The syntax/semantics interface for numbers would be difficult (and perhaps impossible) to design in a way that is consistent with the remainder of the system. For example, how do you combine the words for 5, 100, 6, 10, and 7 to create the number 567? Should we simply link adjective forms? Or should adjective, noun, and perhaps verb forms be combined? Can it be done in a way that is consistent with the way adjectives, nouns, and verbs are combined in non-numeric modification? Can the result conform to a self- segregating morphology? And can it undergo further derivation? Personally, I don't think it's possible without extreme complication and without the adoption of ad hoc rules that apply only to numbers. 3. The resulting numbers would be very long - much longer than is common in natural languages. Instead, I am going to suggest a simple system that creates a single, efficient word for any quantity, regardless of size. To accomplish this, we will allocate several numeric roots that will also function as MCMs. These root/ MCMs can be combined to represent the actual quantity. There will also be morphemes to represent ordinality, radix, a minus sign, a decimal point, an exponent, and so on. 
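To give a feel for how mechanical this single-word scheme is, here is a minimal Python sketch that assembles number words from the digit, minus-sign, decimal-point, exponent, and ordinal morphemes and the word format specified in the remainder of this section. The function name, its string-based arguments, and the omission of the radix slot are my own simplifying assumptions; the optional hyphens, which are only reading aids, are left out.

```python
from typing import Optional

# Morphemes taken from the number-forming table in section 10.1.
DIGITS = {
    "0": "zeyo", "1": "fe", "2": "du", "3": "zi", "4": "kau",
    "5": "poi", "6": "bua", "7": "vastu", "8": "ketsa", "9": "go",
}
MINUS = "minsu"
POINT = "cuye"
EXPONENT = "jinta"
ORDINAL = "xunga"
ADJECTIVE = "no"  # part-of-speech ending used in the examples, e.g. "zino"


def spell_digits(digits: str) -> str:
    """Concatenate the morpheme for each decimal digit."""
    return "".join(DIGITS[d] for d in digits)


def number_word(mantissa: str, exponent: Optional[str] = None,
                ordinal: bool = False) -> str:
    """Build one number word from strings such as "-2.5" and "9".

    Hypothetical helper: it covers base ten only and ignores the radix slot.
    """
    parts = []
    if mantissa.startswith("-"):
        parts.append(MINUS)
        mantissa = mantissa[1:]
    whole, _, fraction = mantissa.partition(".")
    parts.append(spell_digits(whole))
    if fraction:
        parts.append(POINT + spell_digits(fraction))
    if exponent is not None:
        sign = ""
        if exponent.startswith("-"):
            sign = MINUS
            exponent = exponent[1:]
        parts.append(EXPONENT + sign + spell_digits(exponent))
    if ordinal:
        parts.append(ORDINAL)
    parts.append(ADJECTIVE)
    return "".join(parts)


if __name__ == "__main__":
    print(number_word("29", ordinal=True))      # 29th
    print(number_word("2.509", exponent="9"))   # 2.509 x 10**9
```

Because every slot simply appends morphemes in a fixed order, the same routine could be extended with a radix prefix or extra digit morphemes without changing its structure.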
In our sample language, a basic number will have the following format: ( radix ) + ( minus sign ) + [ digit ] + ( decimal point + [digit] ) + ( exponent + (minus sign) + [digit] ) + ( ordinality ) + part-of-speech where [] indicates one or more of the enclosed item () indicates that the enclosed item is optional Here are the number-forming morphemes: -minsu- minus sign (default = positive) -zeyo- zero -fe- one -du- two -zi- three -kau- four -poi- five -bua- six -vastu- seven -ketsa- eight -go- nine -cuye- decimal point -jinta- exponent -- cardinal (this is the default) -xunga- ordinal Here are some examples: zeyono= 0 zino = 3 zixungano = 3rd dugono = 29 dugoxungano = 29th duzeyoduno = 202 duzeyozeyono OR dujintaduno = 200 du-cuye-poino = 2.5 minsudu-cuye-poino = -2.5 du-cuye-poizeyogo-jintagono = 2.509 x 10**9 Note that, since numbers are essentially P-s adjectives, we will assign P-s as the default class for numbers. (We will see later how to derive other forms.) Thus, numbers represent the state concept 'being N in number'. Hyphens can be used at appropriate spots to make the results easier to read. Simple powers-of-ten can be represented with the exponent feature or by using an appropriate number of zeros. For example, 'ten' would be "fezeyono" or "jintafeno", 'hundred' would be "fezeyozeyono" or "jintaduno", and so on. If you want more efficient results, however, you could also implement single morphemes for the more common powers-of-ten. 
For example:

    -dai-      ten
    -senti-    hundred
    -kio-      thousand
    -milni-    million

    10        = daino
    100       = sentino
    1,000     = kiono
    1,000,000 = milnino

If you want compound numbers to more closely resemble their natural language counterparts, you can optionally use powers-of-ten, as in the following examples:

    24        = dudai-kau-no (literally 'two-ten-four')
    657       = buasenti-poidai-vastu-no (literally 'six-hundred-five-ten-seven')
    21,007    = dudaifekio-vastu-no (literally 'two-ten-one-thousand-seven')
    7,000,028 = vastumilni-dudai-ketsa-no (literally 'seven-million-two-ten-eight')

Numbers can also be expressed in different bases. Here are some additional morphemes that can be used for hexadecimal numbers:

    -heksi-    hexadecimal radix (default = base ten)
    -maya-     A hex
    -biwi-     B hex
    -cawa-     C hex
    -doyo-     D hex
    -neye-     E hex
    -fuyu-     F hex

Here are a few examples:

    heksi-du-zi-no           = 23 hexadecimal  = 35 decimal
    heksi-fe-maya-fuyu-no    = 1AF hexadecimal = 431 decimal
    heksi-minsu-neye-zeyo-no = -E0 hexadecimal = -224 decimal

The actual value of a numeric morpheme is never fixed, but must always be interpreted according to the current radix. In other words, the numeric morpheme simply represents a fixed string of one or more digits. Thus, the word "heksidaino" is 10 hexadecimal = 16 decimal, "heksisentino" is 100 hexadecimal = 256 decimal, and "heksikiono" is 1000 hexadecimal = 4096 decimal.

As with all MCMs, the numeric morphemes can be used as either roots or MCMs. Thus, they may be used in words other than numbers. We'll see examples of this later.

We can designate an additional morpheme to indicate fractions.
For example, if the fraction morpheme is "-divde-" and has the meaning 'over' or 'divided by', then we can do the following: -divde- divider, X/Y fedivdeduno 'one half' dudivdegono 'two ninths' dudivdesentino 'two hundredths' = 2 percent du-cuye-poi-divdesentino 'two point five hundredths' = 2.5 percent Note that there is no need to create a separate word for 'percent', since the combination "-divde-senti-" effectively means 'percent'. As a short cut, we could also designate a unique morpheme to indicate simple 'one-over-X' fractions. For example, if this morpheme is "-fevde-", then we can do the following: -fevde- divider, 1/X fevdeduno 'one half' fevdezino 'one third' fevdevastugono 'one seventy-ninth' In effect, "-fevde-" is an abbreviation for "-fe-divde-". To handle imaginary numbers, we will use the morpheme "-vevna-", as in the following examples: -vevna- real/imaginary separator zi-vevna-duno 3 + i2 du-vevna-minsukauno 2 - i4 ducuyedu-vevna-minsuzicuyekauno 2.2 - i3.4 Additional numeric morphemes can be allocated to derive words that are quantitative, but less specific. Here they are: -saksi- all, the whole amount -mai- many, much, a lot, a large amount -xandu- not too many, not too much -pewa- few, little, a small amount -zonja- any, positive non-zero, one or more, greater than zero These can be used just like the others (although use of radix, decimal points, exponents, and so on would not make sense): I saw maino guasuda = I saw many ducks. Do you have zonjano mirrors? = Do you have one or more mirrors? = Do you have any mirrors? Sorry, we sold saksida yesterday. = Sorry, we sold all of them yesterday. I'd like pewano guafada, please. = I'd like a little water, please. There's maino soup in the pot. = There's a lot of soup in the pot. Note how, in the last two examples, less specific numerics can also be used to modify mass nouns. 
In fact, we can adopt the convention that the less specific numerics will have a mass interpretation when modifying mass nouns and a count interpretation when modifying count nouns. For example, "saksino" will mean 'every' when applied to count nouns and 'the entire amount' when applied to mass nouns. If we want to change the default, the appropriate CCM can be applied to the noun. For example, 'the entire duck' would be "saksino guasujazmida", where "-jazmi-" is the 'mass' CCM.

The less specific numerics may also be combined with specific numbers to represent useful words and expressions:

    dusaksino       'both' = 'two + all'
    vastusaksino    'all seven'
    saksifeno       'each/every' = 'all + one'

The non-specific morphemes will not be useful with the ordinal marker unless we adopt a different interpretation. I suggest that a non-specific numeric plus an ordinal marker be interpreted as an appropriately sized portion of the entire amount, as follows:

    saksixungano    'the entire', 'all of the'
    maixungano      'a large fraction/portion of'
    xanduxungano    'a small fraction/portion of'
    pewaxungano     'a tiny fraction/portion of', 'a modicum of'
    zonjaxungano    'a fraction/portion of'

The ordinal marker can also be used with the 'maximal' and 'minimal' scalar polarity morphemes "-pi-" and "-ju-":

    -pi-    maximally, extremely         pixungano    final, last few
    -ju-    minimally, barely, hardly    juxungano    initial, first few

Other classifiers can be used to create words with different argument structures. Here are some examples (the default for all numerics is P-s):

    P-s:    duno          'two'
            dusi          'to be two in number'
            zisi          'to be three in number'
            du-cuye-zisi  'to be 2.3 in quantity'
    P-d:    dupiasi       'to become two in number'

Since numerics select a subset from a potentially larger set, they imply a relationship between the subset (P) and the larger set (F). For example:

    P/F-s:  femasi  'to be one of' (eg. "He's one of the Smith boys.")
    P/F-d:  dudosi  'to become two of' (eg.
"They just became two of our newest members.") As nouns, they represent the concept "N entities" or "an N-some": I have daino copies left = I have ten copies left. Please give me buada = Please give me six (of them). I met the zida yesterday = I met the threesome/trio yesterday. Other combinations of numeric root, classifier, CCM, and MCM can be used to produce many useful words. Here are some useful derivations from the number 'one': P-s: fesi 'to be single/one in number' P-s adj: feno 'one', 'single', 'only', 'sole' P-s noun: feda 'a unit', 'a single entity' P-s adv: fepe 'together', 'as a unit', 'in unison' fenape 'apart', 'separately' P-s quality: feveda 'unity, 'oneness' feveno 'unary', 'monadic' P-d: fepiasi 'to coalesce', 'to become one (in number)' fenapiasi 'to come apart', 'to disintegrate', 'to break/split up' P-d process: fepiapada 'coalescence' fenapiapada 'dissociation', 'disintegration' A/P-d: fepusi 'to unify', 'to integrate' fenapusi 'to separate/divide/split apart' A/P-d process: fepupada 'unification' A/P-d [+F]: fenakomiusi 'to segregate/set apart' And so on. There are many others. 10.2 IMPLEMENTING MEASURE WORDS Earlier, we discussed how the focus of basic scalar state verbs could elaborate the state, as in the following examples: Saudi Arabia is rich vs. Saudi Arabia is rich in oil. It's also possible to be even more precise, as in: John is rich vs. John is rich to the tune of 3 million dollars. Here, the argument "3 million dollars" is simply the focus of the P/F-s verb meaning 'to be rich'. In other words, any state that can have different degrees of intensity (i.e. scalar states) can be the root of a P/F-s verb that measures the degree of the state. Here are some more English examples: P-s: John is_tall. P/F-s: John is_tall 6 feet = John is 6 feet tall. P-s: The book is_heavy. P/F-s: The book is_heavy 4 kilograms = The book weighs 4 kilograms. P-s: The opera is_long (temporal). P/F-s: The opera is_long 3 hours = The opera lasts 3 hours. 
Thus, there is no need to create special verbs meaning 'to last', 'to weigh', 'to have a volume of' and so on. We simply need to focus the appropriate P-s state verbs and provide a specific measurement as the focus argument. Note that English has only a few verbs such as "to weigh" or "to last". It does not have similar equivalents for most of its measure words. For example, we say "He is too tall" - NOT "*He heights too much", or "The rope is too long" - NOT "*The rope lengths too much". The system proposed here allows you to derive verbs for any kind of measurement. So, let's define a few roots and derive the corresponding measure verbs: -sawa- -> scalar state 'long (temporal)' sawamasi -> P/F-s 'to last F' -hayu- -> scalar state 'heavy' hayumasi -> P/F-s 'to weigh F' -lenga- -> scalar state 'long (spatial)' lengamasi -> P/F-s 'to be F in length/height/depth' [By default, scalar states are P-s. Thus, the P/F-s derivations require the "-ma-" classifier.] The quality CCM "-ve-" can be used to derive the corresponding words meaning 'weight', 'length/height', and 'duration'. For example, the word "hayumaveda" means 'weight'. [Note that we used the P/F-s form for 'weight' because it implies a specific value. The unfocused P-s form, "hayuveda" would be used to represent the meaning 'heaviness'.] Note that measurement nouns such as "weight", "age", "length", and so on can ALSO be obtained via middle voice derivations of the corresponding verbs. For example, the English noun "weight" is also the F-s [-P] noun derivation of the verb "to weigh"; i.e. "hayumadeda". [I leave it as an exercise for the reader to distinguish between the two senses. Here's a hint: the middle derivation is inherently definite, while the quality derivation is inherently indefinite.] Earlier, in the section on abstract nouns, we created the measurement classifier "-ta-". 
Let's use it now to derive a few measurements: de-ta-da -> 'day' me-ta-da -> 'meter' pondu-ta-da -> 'pound' [Note that, since these are basic nouns, the root is used for its mnemonic value, which means it can be used for its sound value. It doesn't even have to use defined morphemes.] More precisely, the noun forms have the meaning "an entity of one X in measure", as in: detada - "a day", "an entity that is one day long" e.g. He spent the detada with his mother. = He spent the day with his mother. metada - "an entity that is one meter long/high/deep" e.g. He cut the metada. = He cut the meter-long item/object/thing. The verb forms are inherently P-s and have the meaning 'to be one X in measure': detasi -> 'to last one day' metasi -> 'to be one meter in length/height/depth' pondutasi -> 'to weigh one pound' Thus, the adjective forms of the measure words would have the meaning "being one X in measure", as in: pondutano "being one pound in weight" e.g. He lifted the pondutano statue. = He lifted the one-pound statue. metano "being one meter in length/height/depth" e.g. He cut the metano rope. = He cut the meter-long rope. Note that when count adjectives modify measure NOUNS, they refer to N distinct entities. They do NOT refer to an entity that has a measure of N X's. Thus: He cut the ketsano metada. = He cut the eight meter-long items/objects/things. and NOT: He cut the eight-meter-long item/object/thing. In other words, there are eight items, and each item is one-meter long. If we need to multiply the measure itself rather than the number of entities, we simply append the appropriate numeric morphemes. The result will modify the MAGNITUDE, rather than the quantity. Here are some examples (using English word order): -kio- 'one thousand' = 'kilo-' metakiosi 'to be one kilometer in dimension' We can also use exponents: -jinta- exponent -minsu- minus -zi- three -jintazi- kilo- -jintaminsuzi- milli- Note that "-kio-" is an alternative to "-jintazi-". 
In fact, the negative exponent is so useful, that I have allocated a separate morpheme for it in the sample language: -hu- negative exponent (equivalent to -jinta-minsu-) -du- two -hudu- centi- -zi- three -huzi- milli- -bua- six -hubua- micro- metahudusi to be one centimeter in length metahuzisi to be one millimeter in length metahubuasi to be one micrometer in length And so on. This approach can be used to create counterparts to English morphemes such as "bi-", "tri-", "penta-", "centi-", "sesqui-", etc. Here are some more examples: He cut the metakauda. = He cut the four-meter-long item/object/thing. He spent the detakauda with his mother. = He spent the four-day period with his mother. He spent duno detakauda with his mother. = He spent two four-day periods with his mother. However, I'm not sure that this is a good approach. In general, words should be created only if they are useful in their own right and have long-term value. The above examples are just temporary, on-the-fly creations. Still, this type of construction seems to be universal among natural languages (perhaps because numbers are, by nature, temporary, on-the-fly constructions). Alternatively, if you're a purist, you could paraphrase: His visit sawamasi kauno detada. = His visit lasted four days. or use relative clauses: He spent the period which sawamasi kauno detada with his mother. OR He spent the period which detasi kaulape with his mother. = He spent the period which was four days long with his mother. He cut the thing which lengamasi kauno metada. OR He cut the thing which metasi kaulape. = He cut the thing which was four meters long. As can be seen from the above examples, the "0" adverbial form of a count word (classifier "-la-") has the meaning "N-times". Note that the alternative approach is useful in its own right whenever we want to use the verb form of a measure along with a count. In this case, we use a verb for the measure and the "0" adverb for the count. 
Thus: The book pondutasi kaulape. = The book 'weighs one pound' 'four-times'. = The book weighs four pounds. English speakers, however, will probably be more comfortable using the equivalent of the English verb 'to weigh': The book hayumasi kauno pondutada. = The book weighs four pounds. It is important to note that we can NOT use the P-s adverb form - we HAD to use the "0" form. The reason is that the P-s form will imply a link to the patient of the verb, thus indicating quantity. Even so, the P-s adverb form is still very useful: The book pondutasi kaupe. = The book 'weighs one pound' 'being four in number'. = All together, the four books weigh one pound. = The four books weigh one pound by themselves. [If you have difficulty with the above translation, try it using the number 'one' instead of 'four'. The resulting translation will contain an expression such as 'singly' or 'by itself'.] The "0" form, however, directly modifies the verb and does not link to any argument of the verb. Thus, we are, in effect, indicating the 'quantity' of the verb; i.e. the frequency of the event. [This is an important distinction that will come in handy again later, when we discuss _comparatives_.] The same effect can also be obtained by using the previous-word modifier part- of-speech (terminator = "-di"), since it always modifies the word that it immediately follows. Thus, the last example that used "-la-" could also have been implemented as follows: The book pondutasi kaudi. = The book 'weighs one pound' 'four-times'. = The book weighs four pounds. Keep in mind, though, that the "0" form will probably be more useful, since it is a true adverb and does not have to immediately follow the verb. For example, direct and/or oblique objects can appear between the verb and the adverb. Adverbial "0" forms of the non-specific numerics are also very useful. 
Here are some examples (for convenience, I've repeated some of the numeric morphemes below):

    -saksi-    all, the whole amount
    -mai-      many, much, a lot, a large amount
    -xandu-    not too many, not too much
    -pewa-     few, little, a small amount
    -zonja-    any, positive non-zero, one or more, greater than zero

    saksilape = 'always', 'all the time'
    mailape   = 'often/frequently/a lot'
    xandulape = 'sometimes/occasionally'
    pewalape  = 'rarely'
    zonjalape = 'ever (in questions)', 'at least once'

Also, from the specific numerics, we get:

    zeyolape = 'never'
    felape   = 'once'
    dulape   = 'twice'
    zilape   = 'three times/thrice'

The ordinal derivations are also useful:

    fexungalape = 'for the first time'
    duxungalape = 'for the second time'

    e.g. "Yesterday, he went to Boston duxungalape"
         = "Yesterday, he went to Boston for the second time."

And so on. Again, the shorter "-di" forms can also be used, but they must always immediately follow the verb.

We can also handle noun phrases that contain both counts and measures. For these, we have several options:

    The open adjective "mabie":

        I bought gono pondutada mabie rice.
        or I bought rice mabie gono pondutada.
        = I bought nine pounds of rice.

    A relative clause:

        I bought rice that hayumasi gono pondutada.
        or I bought rice that pondutasi golape.
        = I bought rice that weighed nine pounds.

    An open noun version (terminator = "-giu") of the P/F-s measure verb:

        I bought gono pondutagiu rice.
        = I bought nine pounds-of rice.

    A numeric multiplier and the adjective form of the measure:

        I bought pondutagono rice.
        = I bought rice being-nine-pounds-in-weight.

    A previous-word modifier:

        I bought pondutano godi rice.
        = I bought rice being-one-pound-in-weight-times-nine.

Agentive versions of the measure verbs are also useful. Here are some examples:

    P/F-s:    The speech sawamasi 25 minutes.
              = The speech lasts 25 minutes.
    A/P/F-s:  He sawatuesi the speech 25 minutes.
              = He makes the speech last 25 minutes.
    A/P/F-d:  He sawakosi the speech 25 minutes.
= He lengthened the speech by 25 minutes. [Remember, the focus of "-s" scalar states elaborates the ACTUAL magnitude, while for "-d" and "-p" verbs, it elaborates the CHANGE in magnitude.] If we want to create versions of the English verbs such as "to time" as in "He timed the performance" or "to weigh" as in "He weighed the rice", then we need to create activity versions of the basic state verbs. In the sample language, we will accomplish this with the special CCM "-vie-". When added to a state root, it will convert the root to one with the meaning 'to measure or determine the state of' with a class of AP/F-d. Here are some examples: AP/F-d: sawaviesi = 'to time', 'to measure/determine the duration of' hayuviesi = 'to weigh', 'to measure/determine the weight of' lengaviesi = 'to measure/determine the length/height/depth of' The generic derivations are also useful. For example, the AP/F-d generic action verb "viesi" means 'to measure'. The AP-s verb "viepanjisi" means 'to do the measuring'. The AP/F-s verb "viefisi" means 'to take measurements of'. And so on. Finally, do not confuse measure words with specific entities that have precise measures, such as the named time periods "September", "Tuesday", and "1994". These are proper nouns and we'll discuss how to deal with them later. 10.3 OTHER NUMERIC DERIVATIONS It would also be useful to have a separate numeric morpheme to indicate the concept 'N at a time' or 'N per group'. For example, if we allocate "-kawa-", for this morpheme, we would get the following: -kawa- N at a time, N per group, in groups of N fekawalape = one at a time dukawalape = two at a time duvastukawalape = twenty-seven at a time saksikawalape = all at once, all at the same time, all together vastukawano guasuda = a group of seven ducks And so on. Note how, when we directly modify a verb, we get the 'N at a time' meaning, and when we modify a countable noun, we get the 'in groups of N' meaning. 
[Do not confuse "-kawa-" with the 'group' CCM "-senje-" that we discussed earlier. The 'group' CCM changes the inherent nature of the noun, while "-kawa-" simply describes the count noun in greater detail. In other words, "-senje-" changes the class of the noun by creating a distinct, single entity, while "-kawa-" simply modifies the noun. Also, "-senje-" implies that the members of the group contribute to the operation or function of the whole, while "-kawa-" does not have this sense at all.] We can also specify the number of groups: vastukawaduno guasuda = two groups of seven ducks Note that the order of the number morphemes is the exact opposite of English. This is because English numbers modify the noun to their right, while MCMs in the sample language modify everything to their left. The A/P-d verbal derivations are also useful: John kawapusi guasuda = John assembled/combined/arranged the ducks into groups. John kawafepusi guasuda = John assembled/combined/arranged the ducks into one group. John vastukawapusi guasuda = John assembled/combined/arranged the ducks into groups of seven. John vastukawadupusi guasuda = John assembled/combined/arranged the ducks into two groups of seven. If you're not comfortable with putting so much information into a single word, you can break it up into smaller words (remember, all numeric morphemes are P-s by default): P-s: kawada = a group kawada mabie vastuda = a group of seven kawada mabie vastuno guasuda = a group of seven ducks zino kawada mabie vastuno guasuda = three groups of seven ducks P/F-s: kawamagiu = a group of kawamagiu vastuda = a group of seven kawamagiu vastuno guasuda = a group of seven ducks zino kawamagiu vastuno guasuda = three groups of seven ducks We can also use the unadorned A/P/F-d derivation: kawakosi = to arrange/combine/assemble P into groups of F John kawakosi guasuda vastuda = John arranged the ducks into groups of seven. [Keep in mind that A/P/F verbs are di-transitive and take TWO objects.] 
We will also need some root/MCMs that will allow us to compare quantities. As it turns out, we already have them - we introduced them in the chapter on polarity. Here they are again: -pi- 'maximally', 'extremely' -ge- 'very', 'highly' -so- 'not too', 'not very' -ju- 'minimally', 'barely', 'hardly' However, in the section on polarity, we used them as UNFOCUSED concepts. When focused, they are actually comparatives. Since this may not be immediately obvious, consider the following: tenci + pi = maximally intelligent = genius tenci + ge = very/highly intelligent = brilliant etc. In the above combinations, we are effectively comparing an entity with an unspecified focus. For example, a 'brilliant' person is highly intelligent compared to the average person. Now, when a morpheme such as "-ge-" follows a root morpheme meaning 'X', we obtain the sense 'more than X'. However, what we need here is the inverse concept 'X more than'. So, to obtain this sense, we will simply adopt the convention that when something FOLLOWS "-ge-", it will have the sense of 'X more than'. We will use the same approach for the other scalar polarity morphemes. Here are some examples: gefeno guasuda = one more duck geduno guasuda = two more ducks sozino guasuda = three fewer ducks Also, when "-ge-" is attached to an adjectival concept, it is equivalent to the English suffix "-er", while "-pi-" is equivalent to English "-est". For example, "tencisepino = tencipino" means 'smartest' or 'most intelligent', while "teyomapisi = teyopisi" means 'to be most knowledgeable about/in'. [Note the distinction between "tencisepino=tencipino" = 'most smart' and "tencipiseno" = 'genius'. Keep in mind that when the modifier directly modifies the root, it modifies the INHERENT state (i.e. 'genius'). When it follows the verb classifier, it modifies the current or local state (i.e. 'most intelligent'), and the actual degree will depend on the context.] 
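The ordering convention just adopted for "-ge-" and "-so-" with numbers can be summarized in a few lines of code. This is only an illustrative sketch: the function name `gloss_pair` and the two small tables are my own assumptions, and only a handful of numerals are included.

```python
# A few numeric morphemes and the two scalar polarity morphemes used as
# comparatives. Both tables are deliberately incomplete (illustration only).
NUMERALS = {"fe": 1, "du": 2, "zi": 3, "go": 9}
POLARITY = {"ge": ("more", "more than"), "so": ("fewer", "fewer than")}


def gloss_pair(first: str, second: str) -> str:
    """Gloss a two-morpheme numeric/polarity unit.

    polarity + number -> 'N more' / 'N fewer'       (number FOLLOWS -ge-/-so-)
    number + polarity -> 'more than N' / 'fewer than N'
    """
    if first in POLARITY and second in NUMERALS:
        return f"{NUMERALS[second]} {POLARITY[first][0]}"
    if first in NUMERALS and second in POLARITY:
        return f"{POLARITY[second][1]} {NUMERALS[first]}"
    raise ValueError("expected one numeral and one polarity morpheme")


print(gloss_pair("ge", "du"))  # as in "geduno guasuda"
print(gloss_pair("so", "zi"))  # as in "sozino guasuda"
print(gloss_pair("du", "ge"))  # number first: the focused, 'more than' sense
```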
Now, the 'N more/less' interpretation only makes sense if the noun is a count noun. If it is a mass noun, or if a number is not specified, then the interpretation is simply 'more' or 'less/fewer'. Thus,

    geno guafada = more water
    sono guafada = less water
    geno guasuda = more ducks
    sono guasuda = fewer ducks

The "0" verb forms are also useful:

    gefelape   = again, one more time
    gedulape   = two more times, twice more
    gezeyolape = zero more times, not...again, never...again, not...anymore
    gemailape  = many more times, much more
    gepewalape = a few more times
    sodulape   = two fewer times
    gelape     = some more, as in "He did it some more"
    solape     = less, as in "He sleeps less now"

And so on. Note that "-ge-fe-" has the same meaning as the English prefix "re-". Here is an example:

    P/F-d: teyodosi     = 'to learn'
           teyodogefesi = 'to re-learn'

However, there is a potential problem with the above derivation. In the sample language, when an MCM is added to a stem, it immediately applies to the entire stem. Thus, the above derivation proceeds as follows:

    teyodosi     = 'to learn'
    teyodogesi   = 'to learn a lot'
    teyodogefesi = 'to learn a lot once'

Obviously, the "correct" derivation is not going to be very useful, and, as we saw earlier, we already have derivations for words like "once", "twice", etc. We can, of course, create a single morpheme to represent "ge + fe", but this is really not necessary. In the sample language, we will simply adopt the rule that whenever a numeric morpheme immediately precedes and/or follows a scalar polarity morpheme, they will be considered as a unit.
Thus, with this rule, the above example is correctly parsed as:

    teyodo(gefe)si = 'to learn one more time' = 'to re-learn'

Here are some more examples:

    AP/F-s: zefisi      = 'to do (something)'
            zefigefesi  = 'to re-do (something)', 'to repeat (something)'
    AP/F-d: mesuasi     = 'to reach', 'to arrive at', 'to come to'
            mesuagefesi = 'to return to'

For relative ordinals, we can use the morphemes "-ge-" or "-so-" plus the ordinal marker "-xunga-". If a numeric morpheme is not provided, the default value will be 'one'. Here are some examples:

    gexungano    'next' = 'the one-more-th'
    soxungano    'last/previous' = 'the one-less-th'
    geduxungano  'next plus one', 'the one after the next one'
    soduxungano  'the one before the last'

Now, note that N follows the polarity root/MCM in the above examples. It is, in effect, the patient of an inverse, unfocused relationship. If we place a number BEFORE the polarity root/MCM, it will represent the focus; i.e., it will represent the concept 'more than N' or 'less than N'. Here are some examples:

    dugeno guasuda = more than two ducks
    zisono guasuda = fewer than three ducks
    dugelape       = more than twice
    zisolape       = fewer than three times

Note that the numeric root/MCM "-zonja-" that we derived earlier and meaning 'any' or 'more than zero' is actually a useful abbreviation for "-zeyo-ge-". Should we ever need an explicit plural marker, we can use "-fe-ge-", meaning 'more than one'. The singular, of course, is simply "-fe-".

We can also implement the concept of 'same amount/quantity' by using a morpheme that we introduced earlier:

    -kapsu-   same, equal

When it follows the numeric morpheme "-zonja-" (meaning 'any'), it will have the meaning 'same amount or quantity'.
Here are some examples: zonjakapsuno guasuda = just as many ducks, the same number of ducks zonjakapsuno guafada = just as much water, the same amount of water zonjakapsulape = just as often, just as many times, the same number of times When used with a specific number, it emphasizes the exact value: buakapsuno guasuda = exactly/precisely six ducks buakapsulape = exactly six times And so on. The morpheme "-lo-" which we introduced earlier (meaning 'like' or 'similar'), represents the concept 'about' or 'approximately' when used in numeric contexts. Here is an example: -lo- like, similar, about, approximately golono guasuda = approximately nine ducks Finally, keep in mind that numeric morphemes are both roots and MCMs and, as MCMs, they can appear after a classifier. It is not necessary to create distinct numeric words. Here are some examples: guasugoda = nine ducks guasugogeda = more than nine ducks guasugegoda = nine more ducks guasugokapsuda = exactly nine ducks Later, we'll see how the scalar polarity morphemes can be put to even greater use, when we discuss _comparatives_. 11.0 DEIXIS A _deictic_ word is one whose referent is determined by the speech context. For example, in the sentence "I ate here yesterday", there are three deictic words: 1. "I" - The actual referent depends on WHO uttered the sentence. 2. "here" - The actual location depends on WHERE the sentence was uttered. 3. "yesterday" - The actual time depends on WHEN the sentence was uttered. Deictics are inherently unfocusable - NOT because there is no referent - but because the referent can never be stated explicitly. It is always determined by the speech situation. What's especially fascinating about deictics is the strong relationship between their forms and their meanings in many natural languages, as well as the strong relationship between the meanings of deictics that, on the surface, appear to be completely unrelated. 
For example, most natural languages have a three-way distinction between personal pronouns, deictic locatives, and demonstratives:

    1st person:   I/we             here     this/these
    2nd person:   you              there    that/those
    3rd person:   he/she/it/they   yonder   yon

Standard English rarely uses "yon" and "yonder" anymore, but they used to be used quite often. Also, languages that make the three-way distinctions for locatives and demonstratives generally do it in the following way:

    this or here  -> at or near the speaker
    that or there -> at or near the addressee
    yon or yonder -> far from both speaker and addressee

Note that 1st person is the speaker, 2nd person is the addressee, and 3rd person is neither the speaker nor the addressee.

For example, Japanese is fairly typical of how many languages use the same forms for both demonstratives and locatives:

                near speaker    near addressee   far from both
                ------------    --------------   -------------
    adjective   this - "kono"   that - "sono"    yon - "ano"
    pronoun     this - "kore"   that - "sore"    yon thing - "are"
    locative    here - "koko"   there - "soko"   yonder - "asoko"

While not perfectly regular in the modern language, they all evolved from the same roots. English also has a historical link between "this/here", "that/there", and "yon/yonder", although it is less regular. An even better example, though, is Cambodian where the word "nih" means either 'this' or 'here', and the word "nuh" means either 'that' or 'there'. And in Turkish, the same root is used to derive the third person pronouns meaning 'he/she/it/they', the demonstrative meaning 'that', AND the locative meaning 'there'.

As it turns out, this correlation between form and meaning, and the obvious link to 1st, 2nd, and 3rd person referents is quite common among the world's languages, and I suggest we take advantage of it.

Another major difference between deictics and other words is that deictics do not indicate, in any way at all, the nature of their referents.
For example, on hearing the noun "duck", we immediately know a lot about the referent. However, the pronouns "you" or "that" or the adjectives "his" or "this" or the locatives "there" or "yonder" tell us nothing about their referents. Instead, they simply 'point to' or 'index' the actual referent. In computer terms, we can think of a deictic as an index into an array of many potential referents. In effect, a deictic used as a noun is not a true noun, a deictic used as an adjective is not a true adjective, and so on, just as the index into an array is not in the same class as the element it points to.

Deictics are also different from words such as nouns, verbs, etc. because there are very few of them, and because new ones rarely enter a language. For example, new nouns are adopted by a language quite often, while deictics are the result of slow and gradual language evolution that can take centuries.

[Incidentally, since the referents of deictic expressions are effectively 'indexed' by the location of the speaker and the addressee, deictics are also sometimes called _indexicals_, and deixis (i.e. the phenomenon itself) is sometimes referred to as _indexicality_. Also, words that are members of small, closed groups, such as deictics, are called _closed class_ words, while words that are members of large, open groups, such as nouns and verbs, are called _open class_ words.]

In the next few sections, I will propose a highly regular system that can be used to implement personal pronouns, possessive adjectives, possessive pronouns, demonstratives, and deictic locative and temporal words.

11.1 PERSONAL PRONOUNS, POSSESSIVE ADJECTIVES, AND POSSESSIVE PRONOUNS

In the sample language, I will implement deictics by allocating a set of root morphemes that are MNEMONICALLY compositional. In other words, deictics will be formed from true, unique root morphemes, but we will design them in a way that will display their inherent compositionality.
For personal pronouns and possessives, the basic components will be as follows:

        1: mi-
        2: du-                      Sing:   -a-
        3: se-
      1+2: ci-   plus -st- plus     Plur:   -i-
      1+3: be-
      2+3: fa-                      Unspec: -u-
    1+2+3: po-

As usual, the terminator will indicate the part-of-speech. Here are the forms that correspond to the English personal pronouns and possessive adjectives (note the use of the genitive CCM "-xa-" for the possessives):

    I    = mistuda   my    = mistuxano   mine   = mistuxada   1
    you  = dustuda   your  = dustuxano   yours  = dustuxada   2
    it   = sestada   its   = sestaxano   its    = sestaxada   3, singular
    we   = postuda   our   = postuxano   ours   = postuxada   1+2+3
    they = sestida   their = sestixano   theirs = sestixada   3, plural

Note that I did not mark number for "I" and "you", since the actual number is always obvious from context. This is certainly necessary to emulate English "you", since it can be either singular or plural. For English "I", the singular form "mistada" could be used to represent the concept 'I and I alone'. Similarly, the singular you form "dustada" would mean 'you and you alone' and the plural form "dustida" would mean 'you all'.

I also did not mark number for "we". Obviously, the 1+2, 1+3, and 2+3 forms are inherently plural. However, by using the unspecified form, we can indicate that the number of each COMPONENT is not being specified. If we use the singular or plural form, I suggest that it indicate the quantity of the 3rd person component, if present, or the 2nd person component if the 3rd person is not present. The 1st person component is always whoever is speaking, thus there is never a need to specify its number since it is always obvious (even if more than one person is speaking at the same time).
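Because the pronoun forms are built from a person prefix, the infix "-st-", a number vowel, an optional genitive "-xa-", and a terminator, they can be generated from a small table. The following Python sketch simply restates the component table above; the function name `pronoun` and its parameters are hypothetical, and only the noun ("-da") and adjective ("-no") terminators are covered:

```python
# Person prefixes, keyed by the set of persons included (1 = speaker,
# 2 = addressee, 3 = other).
PERSON = {
    frozenset({1}): "mi",
    frozenset({2}): "du",
    frozenset({3}): "se",
    frozenset({1, 2}): "ci",
    frozenset({1, 3}): "be",
    frozenset({2, 3}): "fa",
    frozenset({1, 2, 3}): "po",
}
NUMBER = {"singular": "a", "plural": "i", "unspecified": "u"}
TERMINATOR = {"noun": "da", "adjective": "no"}


def pronoun(persons, number="unspecified", possessive=False, pos="noun"):
    """Compose prefix + 'st' + number vowel [+ genitive 'xa'] + terminator."""
    stem = PERSON[frozenset(persons)] + "st" + NUMBER[number]
    if possessive:
        stem += "xa"  # genitive CCM
    return stem + TERMINATOR[pos]


print(pronoun({1}))                                   # I
print(pronoun({1}, possessive=True, pos="adjective")) # my
print(pronoun({3}, "plural", possessive=True))        # theirs
print(pronoun({1, 3}, "singular"))                    # exclusive 'we', one other
```

For the mixed-person forms, the `number` argument is meant to follow the suggestion above: it describes the 3rd person component if present, otherwise the 2nd person component.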
With this rule, we can create the four first person plural pronouns of a language such as Hawaiian:

    kaaua  = cistada, 1+2 singular, the speaker(s) and the person addressed
    maaua  = bestada, 1+3 singular, the speaker(s) and one other person,
             but NOT the addressee(s)
    kaakou = cistida, 1+2 plural, the speaker(s) and two or more persons
             addressed
    maakou = bestida, 1+3 plural, the speaker(s) and two or more other
             people, but NOT the addressee(s)

Quite a large number of languages have two 1st person plural pronouns. For example, in Indonesian, "kita" has the same coverage as English "we". The second pronoun, however, explicitly EXCLUDES the addressee:

    kami = bestuda 1+3 unspecified, speaker plus one or more others who
           are not present, but NOT the addressee(s)

Pronouns which include the addressee(s) are called _inclusive_, while pronouns which exclude the addressee(s) are called _exclusive_.

Similar derivations can be done to create counterparts of pronouns in other languages. For example, many languages (e.g. Italian, Hungarian) make a singular/plural distinction in their 2nd person pronouns:

    dustada = you (singular)
    dustida = you (plural)

[Some languages use the 2nd person plural form with singular referents to indicate politeness. We'll discuss how to do this later.]

We can make the distinction between 'he', 'she', and 'it' by using a gender MCM (discussed earlier). For example, if the MCM for 'female' is "-gaya-", then the word for English 'she' would be "gayasestada", the word for Spanish 'nosotras' (1+2+3, feminine, singular or plural) would be "gayapostuda", and so on.

The 1st person plural form "mistida" can be used by a speaker to emphasize that he is speaking for himself as well as for others who are also present, even though the others are not speaking at the same time. Or, it could be used by someone who considers himself as inherently plural (e.g. the royal "we").

Some languages (e.g.
Cambodian and several languages of New Guinea) even have versions of 3rd person pronouns that are unspecified for number, as well as 2+3 forms. The system proposed here allows us to create any of these pronouns with total regularity and with whatever degree of precision (or lack of precision) that we need. Some languages have dual (= exactly 2), trial (= exactly 3), and paucal (= a few) forms of their personal pronouns. However, I have (somewhat arbitrarily) provided forms only for 'singular', 'plural', and 'unspecified' number. It would certainly be possible, of course, to provide additional forms for the other numbers. However, these forms are not as common among natural languages, and I don't feel that unique forms are really necessary. Instead, we can create equivalent words by using a specific numeric morpheme as a root, and modifying it with an appropriate deictic morpheme. Here are a few examples: Dual: dupostuda = 'both of us' Trial: zidustuda = 'the three of you', 'you three' Paucal: pewasestuda = 'the few of them' In the above derivations, it was necessary to place the numeric morpheme first. In the sample language, morphemes further refine whatever appears to their left, potentially reducing the number of referents. Thus, if the numeric morpheme had followed the deictic morpheme, it would indicate a possible subset of a larger group. For example, "dupostuda" means 'both of us' where "us" refers to exactly two people, while "postududa" means 'two of us' where "us" refers to more than two people. Verb forms could, by default, be P-s verbs having meanings 'to be X'. The verb "mistusi" would mean 'to be me' or 'I am' (e.g. "Bad carpenter mistusi" = 'I am a bad carpenter') and "mistuxasi" would mean 'to be mine' (e.g. "The pencil mistuxasi" = 'The pencil is mine'). Adjectival forms could be used to handle expressions such as "You boys" in "You boys better behave yourselves", where "You" would be "dustuno" or "dustino" and would modify the noun "boys". 
Adverbial forms will probably not be very useful, but open modifier forms would be treated the same as basic nouns. Use of verb classifiers could create such words as P-d "mistuxapiasi" = 'to become mine', A/P-d "mistuxapusi" = 'to make mine', etc.

11.2 DEMONSTRATIVES

For demonstratives, the basic components will be as follows:

        1: mi-
        2: du-                      Sing:   -a-
        3: se-
      1+2: ci-   plus -mp- plus     Plur:   -i-
      1+3: be-
      2+3: fa-                      Unspec: -u-
    1+2+3: po-

As usual, the terminator will indicate the part-of-speech. Here are the English equivalents:

    this  = mimpada or mimpano   1, singular
    these = mimpida or mimpino   1, plural
    that  = dumpada or dumpano   2, singular
    those = dumpida or dumpino   2, plural
    yon   = sempuda or sempuno   3, unspecified

If you do not want to make the 'that/yon' distinction, use the 2+3 forms:

    that  = fampada or fampano   2+3, singular
    those = fampida or fampino   2+3, plural

Again, using Indonesian as an example, we can create forms that do not specify number:

    ini = mimpuda or mimpuno   1, unspecified ('this/these')
    itu = fampuda or fampuno   2+3, unspecified ('that/those')

Some languages have other versions. For example, 1+2 demonstratives are found in Sre (Vietnam) and Chibemba (Africa).

The basic verb forms can represent P-s concepts such as 'this is' and 'those are'. For example, the P-s verb "mimpasi = mimpasesi" would mean 'this is' in a sentence such as "This is John Smith" or "This is a papaya".

Verb classifiers can also be useful ('to become this entity', 'to make something into that entity', etc.). For example, the A/P-d version of the 3rd person unspecified demonstrative, "sempupusi", would be used to represent "to turn ... into that" in a sentence such as "I TURNED the scrap lumber INTO THAT". [Incidentally, note how "THAT" in the above example can have either a singular or a plural sense, which is why the unspecified form was used.]
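Since the deictic roots are mnemonically compositional, a simple word in the "-st-" or "-mp-" family can also be taken apart again. Here is a rough Python sketch of such an analysis; the name `analyze` and the returned dictionary layout are assumptions of mine, and the pattern deliberately handles only the bare forms (no gender or numeric prefixes and no verb classifiers):

```python
import re

PERSON = {"mi": (1,), "du": (2,), "se": (3,), "ci": (1, 2),
          "be": (1, 3), "fa": (2, 3), "po": (1, 2, 3)}
FAMILY = {"st": "personal", "mp": "demonstrative"}
NUMBER = {"a": "singular", "i": "plural", "u": "unspecified"}
POS = {"da": "noun", "no": "adjective", "si": "verb"}

# prefix + family infix + number vowel + optional genitive + terminator
PATTERN = re.compile(r"^(mi|du|se|ci|be|fa|po)(st|mp)([aiu])(xa)?(da|no|si)$")


def analyze(word):
    """Return the components of a simple deictic word, or None if no match."""
    match = PATTERN.match(word)
    if match is None:
        return None
    prefix, infix, vowel, genitive, terminator = match.groups()
    return {
        "persons": PERSON[prefix],
        "family": FAMILY[infix],
        "number": NUMBER[vowel],
        "genitive": genitive is not None,
        "pos": POS[terminator],
    }


print(analyze("fampuno"))    # 2+3 demonstrative adjective, unspecified number
print(analyze("mistuxada"))  # 1st person possessive pronoun
print(analyze("sempupusi"))  # has a verb classifier, so this sketch gives None
```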
11.3 LOCATIVE DEICTICS For locative deictics, the basic components will be as follows: 1: mi- 2: du- Sing: -a- 3: se- 1+2: ci- plus -ng- plus Plur: -i- 1+3: be- 2+3: fa- Unspec: -u- 1+2+3: po- As usual, the terminator will indicate the part-of-speech. Here are the English equivalents: here = mingupe or minguda 1, unspecified number there = dungupe or dunguda 2, unspecified number yonder = sengupe or senguda 3, unspecified number Note that the English words "here/there/yonder" can cover either a single point or a wide area, making them inherently 'unspecified number'. We can use the singular form to get the sense of 'at this spot', while the plural form can be used to get the sense 'hereabout'. If you do not want to make the 'that/yon' distinction, use the 2+3 forms: there = fangupe or fanguda 2+3, unspecified However, the 3rd person form "sengupe" is still useful since it represents the English adverb "over there". As was mentioned above, the singular and plural forms are also useful: mingape 'at this (particular) spot' dungape 'at that (particular) spot' sengape 'at that (particular) spot over there' mingipe 'hereabout' dungipe 'thereabout' sengipe 'those places over there' Keep in mind that deictic locatives ending in "-pe" are adverbs, while those ending in "-da" are nouns. Note the difference in the following examples: mingupe: I met him mingupe yesterday = I met him here yesterday. dungupe: I put the book dungupe = I put the book there. dunguda: I put the book in dunguda = I put the book in there. The basic verb forms can represent such concepts as 'here is', 'there are', etc. For example, the P-s verb "mingusi" would mean 'here is' in a sentence such as "Here's Bill" or "Here are the books you wanted". However, English speakers should be careful not to confuse the 2nd + 3rd or 3rd person deictic constructions with the P-s verb "veyasesi", discussed earlier, which does not refer to a particular location. 
Consider the following:

    dungusi:  There are the books you wanted.
    veyasesi: There are people who actually like him.

Adjective forms are also useful, as the following examples illustrate:

    sengupe: I saw Sally over there (= I was over there when I saw her).
    senguno: I saw Sally over there (= She was over there when I saw her).
    senguno: The man over there married my sister.

In the last two examples, the adjective "senguno" modifies the nouns "Sally" and "man".

Verb classifiers can also be useful ('to get here', 'to keep there', 'to put over there' etc.). For example, the A/P-s verb "minguzoyasi" would be used to represent "to keep here" in a sentence such as "We keep the plants here during the winter".

Finally, do not confuse deictic locatives with state adverbs such as "near/nearby", "far/far away/far off", etc. The adverb forms often appear to be used deictically, but this is simply because the contextual referent is sometimes the location of the speaker. There are other times, however, when the referent is NOT the speaker:

    Referent is the speech location:
        John lives nearby. (= near here)

    Referent is NOT the speech location:
        When I rented that cheap apartment in Boston, John lived nearby.
        (= near the apartment)

Compare the above with "John lives here" vs. "When I rented that cheap apartment in Boston, John lived here".

In other words, when using an unfocused version of an inherently focused concept, we must supply a default based on context, and sometimes the default referent will be the speaker's location, but not always. It is important to keep in mind that true deictics are inherently UNFOCUSABLE because the referent is ALWAYS determined by the speech act, and can NEVER be stated explicitly.

11.4 TEMPORAL DEICTICS

For temporal deictics, the basic components will be as follows:

        1: mi-
        2: du-                      Sing:   -a-
        3: se-
      1+2: ci-   plus -lk- plus     Plur:   -i-
      1+3: be-
      2+3: fa-                      Unspec: -u-
    1+2+3: po-

As usual, the terminator will indicate the part-of-speech.
I will also adopt the following person/time mappings: 1: present 2: past 3: future 1+2: past, same time unit 1+3: future, same time unit 2+3: (unassigned) 1+2+3: (unassigned) Here are some English equivalents: now = milkupe 1, unspecified earlier = dulkupe 2, unspecified later = selkupe 3, unspecified right now, at this moment = milkape 1, singular currently, nowadays = milkipe 1, plural Note that the above derivations are true deictics. Thus, they cannot be used in a sentence such as "John arrived at 3, but Bill arrived much earlier". Since "earlier" in the example is not relative to the moment of speech, it is not a true deictic. It is simply a temporal state relationship whose referent must be determined from context. (In fact, we derived this word when we discussed temporal case tags. The word is "lundasepe", meaning 'earlier', 'previously', or 'already'.) Languages also have deictics that refer to specific durations, such as 'today', 'tomorrow', and 'yesterday'. For these, I suggest that we simply convert the appropriate measure word to a temporal deictic by suffixing the deictic root to the measure root. In effect, we are using the deictic root as an MCM and forcing the measure to become deictic. Here are some examples: today = deta-milkupe 'day' + present yesterday = deta-dulkupe 'day' + past tomorrow = deta-selkupe 'day' + future earlier today = deta-cilkupe 'day' + past, same time unit later today = deta-belkupe 'day' + future, same time unit If we use numeric multipliers, we can indicate precise temporal distances from the present time. Here are some examples: day before yesterday = detadu-dulkupe '2 days' + 'earlier' day after tomorrow = detadu-selkupe '2 days' + 'later' 3 days ago = detazi-dulkupe '3 days' + 'earlier' 7 days from now = detavastu-selkupe '7 days' + 'later' The speaker also has the option of using a temporal case tag with an appropriate argument such as in "I saw Bill AT three days before today". 
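The temporal forms above follow one pattern: an optional measure root (with an optional multiplier), then the person prefix chosen by the person/time mapping, then "-lk-", a number vowel, and the adverb terminator "-pe". As a rough illustration, here is a Python sketch of that pattern. The function name `temporal_adverb`, its parameters, and the small `COUNT` table are assumptions for the example, and the hyphens shown in the examples above are omitted from the output:

```python
# Person/time mapping from this section (2+3 and 1+2+3 are unassigned).
TIME = {
    "present": "mi",
    "past": "du",
    "future": "se",
    "past_same_unit": "ci",
    "future_same_unit": "be",
}
NUMBER = {"singular": "a", "plural": "i", "unspecified": "u"}
# Only the multipliers used in the examples above.
COUNT = {2: "du", 3: "zi", 7: "vastu"}


def temporal_adverb(time, number="unspecified", measure="", count=None):
    """Compose [measure [+ count]] + time prefix + 'lk' + number vowel + 'pe'."""
    prefix = measure
    if count is not None:
        if not measure:
            raise ValueError("a multiplier needs a measure root such as 'deta'")
        prefix += COUNT[count]
    return prefix + TIME[time] + "lk" + NUMBER[number] + "pe"


print(temporal_adverb("present"))                           # now
print(temporal_adverb("past", measure="deta"))              # yesterday
print(temporal_adverb("future", measure="deta", count=2))   # day after tomorrow
print(temporal_adverb("past_same_unit", measure="deta"))    # earlier today
```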
12.0 REDUCING WORD LENGTH

A powerful derivational morphology can produce a very large number of words from a very small number of morphemes. At times, though, the resulting words can be much longer than their counterparts in English. People who dislike overly long words, especially when the words represent commonly used concepts, may wish to have ways to shorten them. In the next three sections, I will discuss three ways to accomplish this in the sample language.

12.1 MACROS

As it turns out, we already have one solution to this problem. On several occasions, we created abbreviations for commonly used combinations of existing morphemes. Here are a few of them:

    -voi-     'double middle'       = "-de-de-"
    -kua-     'double anti-passive' = "-ga-ga-"
    -fevde-   '1/X'                 = "-fe-divde-" ('one over X')
    -zonja-   'any/some'            = "-zeyo-ge-" ('more than zero')

I will refer to these abbreviations as _macros_. The dictionary entry for a macro does not need to contain a complete, normal, dictionary definition. It only needs to provide the full, non-abbreviated form. For example, the dictionary entry for "-zonja-" could be listed as simply "zonja = zeyo + ge". Note that macros are fully compatible with a self-segregating morphology.

12.2 PARTICLES

To create shorter versions of complete words, we can allocate a distinct part-of-speech terminator. We will call these new words _particles_ and allocate the terminator "-ka" for them. Particles are not semantically compositional, but anything appearing before the terminator can be used for its mnemonic value. For example, if the language designer considered the word "detamilkupe" (meaning 'today') to be too long, he could allocate the particle "deka", or "detaka", or even "demika" to represent it, where "de-", "deta-", and "demi-" would be used for their mnemonic values. [Note that I am not actually making this assignment in the sample language, since I do not feel that it is necessary. I am simply listing it as a possibility.
Later, I will discuss particle assignments that are better motivated.]

In effect, a particle is simply an abbreviation for a complete word. Note that particles, unlike macros, can NOT appear within another word - particles always represent complete, stand-alone words.

Since a particle is not semantically compositional, you cannot determine its meaning from the meaning of its parts, nor does the terminator indicate its part-of-speech. Instead, a dictionary look-up will be required to determine both its meaning and its part-of-speech. And like macros, if a particle represents a single longer word, the dictionary entry does not need to contain a complete, normal, dictionary definition. It only needs to provide the full, non-abbreviated word. Using the above example, the dictionary entry for "deka" could be listed as simply "deka = detamilkupe".

Finally, note that since particles are unambiguously terminated, they are fully compatible with a self-segregating morphology.

12.3 THE LOW LANGUAGE

Human beings do not require a self-segregating morphology in order to understand spoken language. The combination of prosodic features and context is more than sufficient to allow a human listener to parse speech into its component morphemes and words.

For speech processing by computers, however, the problem is much more difficult. It is not currently possible to program computers to parse human speech unambiguously, and it is not likely to become possible in the foreseeable future. When and if it eventually does become possible, it will require expensive, custom designs for each language.

The reason for the difficulty is that natural languages are not self-segregating. If they were, then speech processing would be trivially easy. In the sample language, we have achieved self-segregation by strictly and unambiguously defining the forms of morphemes and by requiring the use of terminators to terminate each word.
The terminators play the additional role of marking the part-of-speech of the word, which can help considerably in syntactic parsing.

However, when speaking to humans, there is no need to terminate ALL words if each stem has a default part-of-speech. For example, for basic nouns, the default part-of-speech would be NOUN; for stems that contain verb classifiers, the default part-of-speech would be VERB; for roots that are P-s by default, the default would be ADJECTIVE unless a verb classifier was used; for verbs formed from temporal or locative roots, the default would be ADVERB or CASE TAG; and so on.

When terminators are used with ALL words, we will refer to it as the _high_ language. When terminators are dropped and defaults are assumed, we will refer to it as the _low_ language. Here are some examples:

   High         Low        Meaning
   ----         ---        -------
   teyomisi     teyomi     to study
   guafada      guafa      water
   guafapusi    guafapu    to water, add water to
   vastuno      vastu      seven
   menadope     menado     locative 'from'

Note that, when a verb classifier follows a noun classifier, the word becomes a verb by default. Note also that when locative concepts are used with verb classifiers, they remain adverbs or case tags by default.

Finally, it is important to emphasize that the low language can never be used when speaking to computers, since computers will not be able to determine word boundaries. The low language can only be used when speaking to people.

13.0 ARTICLES

Articles are used to indicate whether or not a noun has a specific referent. The definite article implies a specific referent, and corresponds to the English word "the". The indefinite article indicates a non-specific referent and corresponds to the English words "a/an". For example, in the sentence "John needs A pencil", the listener does not assume that John needs a particular pencil - any pencil will do. However, in the sentence "John needs THE pencil", the listener assumes that the speaker is referring to a specific pencil.
Not all languages have unique words to represent articles (e.g. Chinese). When articles are not available in a language, word order (e.g. Russian) or verb-marking (e.g. Swahili) can sometimes distinguish between definiteness and indefiniteness. More often, though, the number 'one' is used to represent the English words "a/an" (e.g. French), and a word meaning 'that' is used to represent English "the" (e.g. Indonesian). In fact, in most (and perhaps all) languages that have them, articles are simply phonologically reduced or 'degenerate' forms of the same words meaning 'one' and 'that'. For example, the English word "the" derives from the Old English word for 'that', and the words "a/an" derive from the Old English word meaning 'one'.

In light of the above, we can implement articles in either of two ways:

   1. Use the word for 'one' for the indefinite article, and the word for demonstrative 'that' for the definite article.

   2. Create phonologically reduced forms of the words for 'one' and 'that'.

Now, phonologically reduced forms are really not compatible with the system we are using (nor are they compatible with my own personal tastes). However, we can do something very similar by instead allocating particles for the definite and indefinite articles. Here are possible particle versions of the 2nd+3rd person demonstrative (unspecified number) 'that' and the number 'one':

   'that' = "fampuno"  ->  'the'  = "puka"
   'one'  = "feno"     ->  'a/an' = "feka"

We'll see later how a similar technique can be used to create colloquial expressions, insults, and even foul language.

Keep in mind that a particle is not semantically compositional. Its meaning must always be determined by a dictionary look-up. However, the components of a particle may be used for their mnemonic value.

Another possibility would be to use the 1+2+3 demonstratives for the definite articles. Cambodian does something very similar to this.
It has a word that can mean either 'this/these' or 'that/those', and corresponds exactly to the word "pompuno" in the sample language. It is normally translated as 'the'.

In the sample language, we will NOT assign particles for articles. Instead, we will use "pompuno" to emphasize definiteness and "feno" to emphasize indefiniteness.

14.0 COMPARATIVES

Unlike basic verbs, comparatives do not represent true states or actions. Instead, they indicate the RELATIVE magnitudes of two or more states or the RELATIVE quantities of two or more entities. In a sense, they are somewhat like deictics, since they do not represent exact states or entities. Unlike deictics, though, they do not index or point to exact states or entities. Instead, they simply position one referent with respect to another along a one-dimensional scale:

      John        John        John        John        John
       is          is          is          is          is
      least       less       happy       more        most
      happy       happy                  happy       happy
        |           |           |           |           |
        V           V           V           V           V
        o-----------o-----------o-----------o-----------o
     Absolute                                       Absolute
     Minimum                                        Maximum

Now, the interpretation of comparatives will depend on the nature of what is being compared. Earlier, when we discussed counts and measures, we made an important distinction between counts which were explicitly linked to an argument of the verb, and those which were not linked, but which modified the verb directly. Thus, a P-s adverb had the meaning 'being N in quantity' when it linked to a noun, while the verb-modifying "0" form had the meaning 'being N in frequency'.

Comparatives behave in the same way. However, count words have specific numeric values, whereas comparatives have the very vague meaning of 'relative magnitude'. Thus, when a count modifies a verb, it can only indicate a frequency; i.e. a number or a count of discrete events. A comparative, however, can be interpreted as either degree, duration, or frequency. Consider the following examples:

   degree    -> Fish stinks more than beef.
   duration  -> John studied more than Bill.
   frequency -> He complained more than I did.

Note, though, that these are the most likely interpretations in English, and can change depending on context. Also, when necessary, it is possible to explicitly indicate the desired interpretation:

   degree    -> Fish stinks stronger than beef.
   duration  -> Fish stinks longer than beef.
   frequency -> Fish stinks more often than beef.

   degree    -> John studied harder than Bill.
   duration  -> John studied longer than Bill.
   frequency -> John studied more frequently than Bill.

   degree    -> He complained more vehemently than I did.
   duration  -> He complained longer than I did.
   frequency -> He complained more often than I did.

And yet, when you look more closely, the most likely "more than" default interpretation actually includes all three non-default interpretations. For example, the sentence "John studied more than Bill" could be interpreted as "John studied harder, longer, and/or more frequently than Bill".

In other words, when a "more than" comparative is used with verbs, it can indicate any or all of the three concepts of 'degree', 'duration', or 'frequency'. However, the nature of the verb and the context in which it is used may favor one interpretation more than another. The nature of the verb probably has the strongest effect on the most likely interpretation. Here are some examples:

   Degree -> all non-agentive "-s" and "-d" verbs:

      P-s      to suffer      John suffered more than Bill.
      P/F-s    to love        John loves Marie more than Bill.
      P-d      to heal        John healed more than Bill.
      P/F-d    to remember    John remembered his dad more than Bill.

   Duration -> agentive "-s" verbs:

      AP-s     to jog         John jogged more than Bill.
      A/P-s    to hold        John held the baby more than Bill.
      AP/F-s   to read        John read more than Bill.

   Frequency -> agentive "-d" verbs and all "-p" verbs:

      A/P-d    to break       John broke dishes more than Bill.
      AP/F-d   to escape      John escaped the prison more than Bill.
      A-p      to complain    John complained more than Bill.
      A/P-p    to kick at     John kicked at the box more than Bill.
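
The defaults listed above can be summarized as a small lookup. The following Python fragment is only a sketch: the function name "default_reading" and the idea of passing a verb class as a string such as "A/P-d" are assumptions made for illustration, not part of the sample language. The value returned is merely the most likely reading of a bare "more than", never the only possible one.

```python
def default_reading(verb_class: str) -> str:
    """Return the most likely reading of a bare 'more than' comparative.

    verb_class is written as in the list above, e.g. "P-s" or "AP/F-d":
    case roles, a hyphen, then the "-s", "-d" or "-p" suffix.
    (Hypothetical helper; the notation is taken from the examples.)
    """
    roles, _, kind = verb_class.rpartition("-")
    agentive = "A" in roles            # an agent is among the case roles

    if kind == "p":                    # all "-p" verbs
        return "frequency"
    if kind in ("s", "d"):
        if not agentive:               # non-agentive "-s" and "-d" verbs
            return "degree"
        # agentive "-s" verbs favor duration, agentive "-d" verbs frequency
        return "duration" if kind == "s" else "frequency"
    raise ValueError(f"unrecognized verb class: {verb_class!r}")


for vc in ("P-s", "P/F-d", "AP-s", "A/P-d", "A-p"):
    print(vc, "->", default_reading(vc))
```

Context can always override the value returned here, so a real parser would treat it as a preference and not as a hard rule.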
Keep in mind, though, that the above are the most likely interpretations, but they are not exclusive. For example, the AP-s verb "to jog", as in "John jogged more than Bill", can be interpreted as either duration or frequency, depending on context. Natural languages implement comparative constructions in several different ways. Here are some examples of the major types: 1. The 'from' comparative (e.g. Classical Arabic, Hindi, Japanese, Eskimo, Quechua, Turkish, Burmese) A horse is big FROM a mouse. = A horse is bigger than a mouse. (In these constructions, "from" is the same word or affix used in a sentence such as "He drove FROM Boston to New York".) 2. The 'to' comparative (e.g. Breton, Maasai, not very common) A horse is big TO a mouse. = A horse is bigger than a mouse. (In these constructions, "to" is the same word or affix used in a sentence such as "He drove from Boston TO New York".) 3. The 'more' plus 'on' comparative (e.g. Navaho, Tamil, not very common) A horse is MORE big ON a mouse. = A horse is bigger than a mouse. (In these constructions, "on" is the same word or affix used in a sentence such as "He put the book ON the table".) 4. Comparatives that use opposites or negatives (e.g. Motu, Dakota, Samoan, Nahuatl. This method is very common, but is limited to relatively obscure languages.) A horse is big, a mouse is not big. OR A horse is big, a mouse is small. = A horse is bigger than a mouse. 5. Comparatives formed from verbs meaning 'to be more in degree', 'to be equal in degree', and 'to be less in degree' (e.g. Chinese, Hausa, Swahili, Vietnamese, Yoruba, Cambodian) A horse is big SURPASSING a mouse. = A horse is bigger than a mouse. 6. Comparatives that use special particles (e.g. Hungarian, Russian, Malagasy, English, Basque, Javanese. A large majority of the languages in this group are European.) English: A horse is bigger THAN a mouse. Javanese: A horse is big THAN a mouse. 
The first three methods are essentially metaphoric or idiosyncratic, and I will say no more about them. The fourth method does not require any special treatment, since it simply juxtaposes simple clauses. The fifth and sixth methods, however, do require separate discussion. And since these last two methods are used by most of the world's major languages, I will discuss each of them in detail. In designing your AL, you can use either or both. 14.1 VERBAL COMPARATIVES Comparatives derived from basic verbs are extremely productive, allowing every possible kind of comparison using regular morphosyntax and straightforward semantics. Thus, it is more versatile than the other methods. But the method has its price - a greater number of forms must be learned by the student. Also, since this method is verb-oriented, it is very easy to implement in the system we are using here. However, this method, although common among the languages of the world, is not used by the European languages. Thus, it may seem somewhat odd to speakers of European languages. As we stated above, comparatives simply position one referent with respect to another along a one-dimensional scale. In other words, comparatives perform a function that is similar to the scalar polarity MCMs we discussed earlier. However, the scalar polarity MCMs do not COMPARE two entities - they simply select a position on a pre-defined scale. Thus, we need something that is inherently comparative. Earlier in this monograph, we discussed how the scalar polarity morphemes are inherently comparative when focused. Here they are again: -ge- more than -kapsu- same amount as, same quantity as, just as much/many as -so- less, fewer than However, these MCMs are P/F-s by default. For comparatives, we will typically use the 0/F class (classifier = "-jo-"), since we may not always want to link to the patient of the sentence. 
Here are the results in verb form: gejosi to be or do more in degree than kapsujosi to be or do equal in degree as sojosi to be or do less in degree than Note that the paraphrase "be or do" is intentionally vague, which is why we need the 0/F class. To handle the concepts 'most' and 'least', we use the 'maximal' MCM "-pi-" and the 'minimal' MCM "-ju-": pijosi to be or do most in degree among jujosi to be or do least in degree among We can also use other classifiers to create different verbs. For example, the P/F-s "gesi = gemasi" corresponds in meaning to the English verbs "to surpass" or "to exceed". In the following examples, I will illustrate how to use the above five basic verbs and their derivatives in a large number of comparative constructions, using English syntax. For adverbial forms, I will use the "0" verb class we discussed earlier (classifier = "-la-"). Let's start with some simple examples: 1. John is taller than Bill. = John is_tall gejope Bill. [where "is_tall" is a P-s verb] [A good paraphrase for "gejope" is 'to a greater degree than' or 'more-so than'.] 2. John is as tall as Bill. = John is_tall kapsujope Bill. [A good paraphrase for "kapsujope" is 'to a degree equal to' or 'to the same degree as'. If we need the sense of 'about/approximately as tall as', we can use "lojope" instead of "kapsujope", where the verb "losi = lomasi" means 'to be similar to'.] 3. John is less tall than Bill. = John is_tall sojope Bill. [A good paraphrase for "sojope" is 'to a lesser degree than' or 'less-so than'.] 4. John is not as tall as Bill. = John is_tall sojope Bill. 5. John is not the same height as Bill. = John is_tall kapsunajope Bill. ["-na-" = 'not'] [A good paraphrase for "kapsunajope" is 'to a degree not equal to' or 'to a different degree than'.] Compare the semantics and surface forms of 3, 4, and 5. English is confusing here, since "not as tall as" actually means 'shorter than'. 
The periphrastic form "not the same height as" must be used to obtain a true 'not' sense. Actually, "kapsujope" will probably not be very useful when comparing scalar states, because a focused version of the state verb is more efficient and has the same meaning. For example, number 2 above can be implemented as "John is_tall Bill", where "is_tall" is a P/F-s verb. Here, the focus is simply elaborating the state (cf. "John is_tall 6 feet"). In cases like these, we can also apply the other polarity MCMs directly to the verb. Here's an example: The book hayusi = The book is heavy. ("hayusi" = P-s) The book hayumasi 4 pounds = The book weighs 4 pounds. ("hayumasi" = P/F-s) The book hayumagesi 4 pounds = The book weighs more than 4 pounds. The book hayumagesi the cup = The book is heavier than the cup. In other words, when the patient and focus of a scalar stative verb are being directly compared, we can affix the polarity morpheme to the verb rather than use a separate comparative word. Here are some more examples using external comparative words: 6. John is the tallest of the three brothers. = John is_tall pijope the three brothers. [A good paraphrase for "pijope" is 'to the greatest degree among'.] 7. John is the tallest student in the class. = John is_tall pijope students in the class. 8. John is more quiet than shy. = John is_quiet gejope (he) is_shy. OR = John's quietness gesi (his) shyness. [where "gesi = gemasi" = 'to exceed' or 'to surpass'.] In 8, "gejope" corresponds exactly to the English phrase "more than". And, like the English counterpart, the exact linkage can be ambiguous if the verb has more than one argument: 9. John helps Bill more than Mike. = John helps Bill gejope Mike. In 9, it's not clear whether "gejope" links to "John" or to "Bill". The English "more than" is equally ambiguous. We can, however, resolve the ambiguity by using complete embedded sentences, just as English does: 10a. John helps Bill more than he helps Mike. 
= John helps Bill gejope he helps Mike. 11a. John helps Bill more than Mike does. = John helps Bill gejope Mike does. In our sample language, we also have the option of using specific AP/F-s or P/F-s forms: 10b. John helps Bill more than he helps Mike. = John helps Bill gepe Mike. ("gepe = gemape" = P/F-s) [Here, Bill experiences a greater degree of 'help' than Mike.] 11b. John helps Bill more than Mike does. = John helps Bill gefipe Mike. ("gefipe" = AP/F-s) [Here, John acts to a greater degree than Mike.] More often though, use of "gejope" is not ambiguous since the context allows only one reasonable interpretation: 12. Kids join gangs in Boston more than in Cowtown. = Kids join gangs in Boston gejope Cowtown. There is no ambiguity here because a comparative MUST compare entities that are COMPARABLE. Since the argument of "gejope" is "Cowtown", it makes no sense to compare it with "kids" or "gangs". The only possible link is to "Boston". 13a. John reads novels more than Bill. = John reads novels gejope Bill. Here, the implication is that John spends more time reading novels than Bill does. If we want to compare the time spent reading novels versus, say, short stories, we would do this: 13b. John reads novels gejope short stories. = John spends more time reading novels than time spent reading short stories. If we want to indicate the relative NUMBER of novels read, we do the following in English: 14a. John reads more novels than short stories. But how do we do this in our sample language? When we discussed counts and measures, we made an important distinction between counts which were explicitly linked to an argument of the verb, and those which were not linked, but which modified the verb directly. Thus, a P-s adverb had the meaning 'being N in quantity' when it linked to a countable noun, while the verb-modifying "0" form had the meaning 'being N in frequency'. We can solve our problem by applying the same reasoning to comparatives; i.e. 
by directly modifying the noun rather than the verb. The most direct way to link the comparative directly to an argument rather than to the verb is to use an open adjective form: 14b. John reads more novels than short stories. = John reads novels gejobie short stories. Note that we also could have used the P/F-s form "gebie" with exactly the same meaning. A good paraphrase of "gebie" is 'in greater quantity/amount than'. In other words, since an open adjective modifies nouns rather than verbs, it implies 'quantity' rather than 'degree', 'duration', or 'frequency' when linking countable nouns. Also, by using an open adjective, there is never any ambiguity about which arguments are being compared, because an open adjective always links to a specific noun if the syntax is not ambiguous. Now, how do we represent the following? 15a. John reads more novels than Bill. Here, we are comparing the number of novels that John reads with the number of novels that Bill reads. And since items compared must be comparable, we cannot say: 15b. John reads more novels than Bill. = *John reads novels gebie Bill. The 15b translation implies that the novels read by John outnumber a quantity represented by the noun "Bill", which is gibberish. It makes no sense to compare novels with humans. Thus, what we need is this: 15c. John reads more novels than Bill. = John reads novels gebie novels which Bill reads. = John reads novels gebie Bill's. Here, "Bill's" is the genitive noun formed with the CCM "-xa-", which we discussed earlier. It could also be expressed as "novels mabie Bill", where "mabie" is the generic P/F-s open adjective. Another possibility would be to use "X mabie Bill", where "X" is an anaphor for the first occurrence of "novels". [I'll have more to say about anaphora later.] It is also possible to have a 'degree' interpretation for nouns, but these will always be stative constructions, and will use the case tag "gejope": 16. John is more of a fighter than Bill.
= John is a fighter gejope Bill. 17. John is more of a whiner than a fighter. = John is a whiner gejope a fighter. Note that 16 and 17 are not ambiguous because "gejope" can only link to an argument of the verb that is comparable to ITS argument. Thus, "John" and "Bill" are being compared in 16, while "whiner" and "fighter" are being compared in 17. Now, let's look at some different kinds of comparative constructions: 18. John likes taller girls than Louise. = John likes girls that are_tall gejope Louise. 19a. John likes taller girls than Bill. = John likes girls that are_tall gejope girls that Bill likes. Here, our sample language must resort to using relative clauses, just as we used embedded sentences in examples 10a and 11a. However, English constructions like 18 and 19a are almost always ambiguous, as in: John knows wealthier people than Bill. which can have two meanings: John knows people who are wealthier than Bill. John knows people who are wealthier than the people Bill knows. Note, though, that 18 and 19a ARE ambiguous, if you look closely enough. The sexual differences simply make one interpretation much more likely than the other. Ambiguities like these must be resolved by using relative clauses in both English and our sample language. However, as we did with 15 above, we can make 19a more efficient using a genitive/associative construction: 19b. John likes taller girls than Bill. = John likes girls that are_tall gejope Bill's. Now, let's look at some different kinds of examples: 20. Few people eat as much as John. = Few people eat kapsujope or gejope John. Here, English "as much as" actually means "as much or more than". You could also state this as "Most people eat less than John". Even better, we can use the single word "sonajope" meaning 'not less than' = 'the same or more than'. [Incidentally, inherently numeric words such as "few", "all", "most", "a majority of", "almost none/all", etc. 
are examples of the non-specific count words we discussed in the section on counts and measures, and they should be created in the same way. Even the word "most", as in "Most dogs bark" is a quantifier meaning 'more than half' - it is NOT a comparative!] Here's an example where quantity is obvious: 21. John broke more windows than Bill. = *John broke windows gejope Bill's. = John broke windows gebie Bill's. Note that the case tag "gejope" cannot be used here, since it always modifies the verb. Since we are simply comparing countable items, the open adjective must be used. Now, let's look at some really tough ones: 22. The more he complains, the louder they play the music. = They play the music loud gelape when/if he complains gelape. Here, "-la-" is the "0" verb classifier, creating an adverb that directly modifies the verb. It can be best paraphrased as 'to a greater degree'. Here's another one: 23. The more I study, the less I know. = I know solape when/if I study gelape. Here, "solape" can be paraphrased as 'to a lesser degree'. 24. The fewer friends we have, the lonelier we are. = We are_lonely gelape when we have sono friends. Here, "sono" is the numeric adjective meaning 'fewer' or 'less'. 25. He is most happy when he is well fed. = He is_happy pilape when he is well fed. What's nice about "-la-" adverbs is that we never have to resort to relative clauses because we never specify either of the two entities or events being compared. As always, of course, we can use more specific forms (e.g. P-s) if needed. 26. John had more money than Bill thought (he had). = John had money gejobie what Bill thought (he had). 27. John baked more pies than Bill told him to (bake). = John baked pies gejobie what Bill told him to (bake). Later on in this monograph, we'll discuss how to derive words like "what" in the above sentences. Here are some more examples: 28a. More people stayed late than left early. = People that stayed late gesi people that left early. 
A sentence like the above would probably use an anaphor such as "those", referring to an earlier word such as "guest", "visitor", etc., rather than repeating the word "people". Another possibility is: 28b. More people stayed late than left early. = People stayed late gejope people left early. Here, we are really comparing degree rather than quantity, but the final meaning is essentially the same. A better English translation would be something like "People stayed late to a greater extent than people left early". In other words, we are comparing the 'stay late' event with the 'leave early' event. A third possibility would be to use the P/F-s verb form: 28c. More people stayed late than left early. = People stayed late gepe people who left early. By using the P/F-s form, we are directly linking to the patient of the main verb, rather than directly modifying the verb. Thus, by directly linking to a countable noun, this construction compares quantity rather than frequency. 29. More than ten people showed up. = Daigeno people showed up. In 29, "daigeno" is a simple numeric adjective. 30. I ate more because I was still hungry. = I ate geda because I was still hungry. In 30, the noun "geda" can only indicate quantity. If we wished to indicate degree, we would have to use the adverb "gelape". 31. John called Bill more than 10 times yesterday. = John called Bill daigelape yesterday. Here, "daigelape" is a simple numeric adverb. If we wished to represent the meaning 'exactly ten times', we would use "dailape". If we wished to represent the meaning 'ten more times', we would use "gedailape". The following are relatively straightforward: 32. John wants a longer report than this. = John wants a report which is_long gejope this. 33. John can run faster than Bill. = John can run fast gejope Bill. 34. You can buy a less expensive car here than at other dealers. = You can buy a car that is_expensive at here sojope at other dealers. 
[Note above that "here" will have to be in its noun form.] 35. You can buy a less expensive car here. = You can buy a car that is_expensive solape here. [Note above that "here" is an adverb.] 36. Here is a less expensive car. = Here_is a car which is_expensive solape. 37. John has more reason to like her than Bill does. = John's reason for liking her gesi Bill's. OR = (Why John likes her) gesi (Why Bill does). Previous-word modifying forms can also be useful if we need to be specific in our comparisons of adverbial concepts. Consider the following example: 38a. John can kick a football far gejope Bill. Since "gejope" is inherently vague, 38a has three possible interpretations: degree: John can kick a football farther than Bill. duration: John can kick a football far, over a longer period of time than Bill. frequency: John can kick a football far, more often than Bill. We can be more specific by using a previous-word form and opening up its argument structure (terminator = "-nia-"), as in the following examples: 38b. John can kick a football farther than Bill. = John can kick a football far gejonia Bill. 38c. John can kick a football far, over a longer period of time than Bill. = John can kick a football far long gejonia Bill. [Here, "long" is an adverb indicating temporal duration.] 38d. John can kick a football far, more often than Bill. = John can kick a football far often gejonia Bill. I believe that the above examples cover just about any kind of comparative construction that you're likely to run into. I decided to use such a large number of examples because natural languages can be extremely idiosyncratic in the way that they implement comparatives, especially the more complicated constructions. An exhaustive analysis should (hopefully) make it less likely that an AL designer will adopt a system that is simply a clone of his natural language. 
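
The comparative words used in the examples above are all assembled from the same few pieces: a polarity MCM, an optional "-na-", a classifier, and a terminator. The Python fragment below is a minimal sketch of that assembly. The function name, the dictionary keys, and the glosses attached to the terminators are assumptions read off the examples (e.g. "gejosi", "gejope", "gejobie", "gejonia"); they are labels chosen for illustration and not definitions from the sample language itself.

```python
# Pieces taken from the examples above; the English keys are hypothetical labels.
POLARITY = {"more": "ge", "same": "kapsu", "less": "so",
            "most": "pi", "least": "ju"}
CLASSIFIER = {"0/F": "jo", "0": "la", "P/F-s": "ma", "AP/F-s": "fi"}
TERMINATOR = {"verb": "si",        # gejosi
              "tag": "pe",         # gejope, gelape (case tag or adverb)
              "open_adj": "bie",   # gejobie
              "prev_word": "nia"}  # gejonia


def comparative(polarity, verb_class, part, negated=False, short=False):
    """Assemble a comparative word: MCM + optional "na" + classifier + terminator.

    short=True drops the P/F-s classifier "-ma-", as in "gesi = gemasi".
    """
    stem = POLARITY[polarity]
    if negated:
        stem += "na"                      # "-na-" = 'not'
    classifier = CLASSIFIER[verb_class]
    if short and verb_class == "P/F-s":
        classifier = ""
    return stem + classifier + TERMINATOR[part]


print(comparative("more", "0/F", "tag"))                # gejope
print(comparative("same", "0/F", "tag", negated=True))  # kapsunajope
print(comparative("more", "P/F-s", "verb", short=True)) # gesi
```

The sketch covers only the separate comparative words. Forms in which the polarity MCM is affixed directly to another stem, such as "hayumagesi" or "daigeno", are not handled.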
Now, many (and perhaps most) of the languages of the world do not make a distinction between comparative and superlative. Instead, they generally rely on context or emphasis when needed. For example, note how the uses of "more" and "most" are mutually exclusive in the following English sentences: *He is the more disgusting person in the choir. *He is the most disgusting one of the twins. If you wish to emulate languages that do not make this distinction, you can start with three words instead of five as we did here. For example, you could use "gejope" in place of both "pijope" and "gejope", and "sojope" in place of both "jujope" and "sojope". Here are two examples using this approach: You can buy a less expensive car here. = You can buy A car which is_expensive solape here. You can buy the least expensive car here. = You can buy THE car which is_expensive solape here. Note the different uses of the articles "A" and "THE". Some natural languages (e.g. French) use this same approach to differentiate between comparative and superlative. Finally, in many cases, it may be easier to apply the scalar polarity MCMs directly to the word being modified, as we mentioned earlier. Here are some more examples using the word "xauno", meaning 'hot': xaugeno = hotter xaugebie = hotter than xausono = less hot xausobie = not as hot as xaupino = the hottest xaujuno = the least hot And so on. 14.2 PARTICLE COMPARATIVES As we saw above, verbal comparatives are quite productive and fit in well with the lexical/semantic system being proposed here. They do, however, have a serious disadvantage. Different morphological forms of a comparative must be used in different syntactic environments. Thus, for example, the single concept 'more' is represented by several different words depending on what is being compared (e.g. "gejope", "gesi", "gejonia", "gejobie", etc.). A particle, however, does not change its form and will have a precisely defined scope. 
As we will see, though, particle comparatives are only slightly less productive and do not suffer the disadvantage of verbal comparatives. In order to get an idea of how to most effectively implement particle comparatives, let's look at a few examples that vary only slightly, and see if we can make some generalizations about them (I will use parentheses plus the particle "more" to show which item is greater in quantity or degree): John reads novels more than Bill. John (more reads) novels vs. Bill reads novels i.e., different verbs, different subjects, same objects John reads more novels than Bill. John reads (more novels) vs. Bill reads novels i.e., same verbs, different subjects, different objects John reads novels more than short stories. John (more reads) novels vs. John reads short stories i.e., different verbs, same subjects, different objects John reads more novels than short stories. John reads (more novels) vs. John reads short stories i.e., same verbs, same subjects, different objects It seems that there are three constituents (verb, subject, and object) which can have either of two values (same or different). This suggests that there could be up to eight possible combinations. Here is a list of all of the possibilities: 1. same verb, same subject, same object This is not a comparison since nothing is different. 2. same verb, same subject, different object John reads more novels than short stories. John reads (more novels) vs. John reads short stories 3. same verb, different subject, same object More women read novels than men. (more women) read novels vs. men read novels 4. same verb, different subject, different object John reads more novels than Bill. John reads (more novels) vs. Bill reads novels 5. different verb, same subject, same object John writes novels more than he reads them. John (more writes) novels vs. John reads novels 6. different verb, same subject, different object John reads novels more than short stories. John (more reads) novels vs. 
John reads short stories 7. different verb, different subject, same object John reads novels more than Bill. John (more reads) novels vs. Bill reads novels 8. different verb, different subject, different object John reads novels more than Bill writes short stories. John (more reads) novels vs. Bill writes short stories. OR John reads more novels than Bill writes short stories. John reads (more novels) vs. Bill writes short stories. Note, though, that we are really comparing only two items: a component of two sentences, plus the two sentences themselves. In order to deal unambiguously with all possible combinations, I suggest the following simple rules: a. When there is a difference in only one constituent, link the two items being compared with the single compound expression "more than". In other words, use "more than" in the same way as you would use a simple conjunction. b. When there is a difference in two or three constituents, apply the particle "more" to the item which is greater in magnitude, and link the two contrasting items with the "than" particle. In other words, use "more" like a modifier and use "than" like a conjunction. (Since our sample language uses English word order, the particle "more" will PRECEDE the item it applies to. The particle "than" will be placed BETWEEN the two items it links.) Rule (a) applies to examples 2, 3, and 5: 2. same verb, same subject, different object John reads more novels than short stories. John reads (more novels) vs. John reads short stories = John reads novels more-than short stories. 3. same verb, different subject, same object More women read novels than men. (more women) read novels vs. men read novels = Women more-than men read novels. 5. different verb, same subject, same object John writes novels more than he reads them. John (more writes) novels vs. John reads novels = John writes more-than reads novels. Rule (b) applies to examples 4, 6, 7, and 8: 4. 
same verb, different subject, different object John reads more novels than Bill. John reads (more novels) vs. Bill reads novels = John than Bill reads more novels. 6. different verb, same subject, different object John reads novels more than short stories. John (more reads) novels vs. John reads short stories = John more reads novels than short stories. 7. different verb, different subject, same object John reads novels more than Bill. John (more reads) novels vs. Bill reads novels = John than Bill more reads novels. 8. different verb, different subject, different object John reads novels more than Bill writes short stories. John (more reads) novels vs. Bill writes short stories. = John more reads novels than Bill writes short stories. OR John reads more novels than Bill writes short stories. John reads (more novels) vs. Bill writes short stories. = John reads more novels than Bill writes short stories. Note that when "more" modifies a countable noun, such as "novels", it indicates quantity. When it modifies a verb such as "reads", it indicates degree, duration, or frequency. Also note that (8) can never be ambiguous. In the first example, "more" modifies "reads", while "than" appears to link with "Bill". However, since a verb cannot be compared with a noun, "than" actually marks the entire clause that follows it. The same logic applies to the second example. In effect, the verb is the head of its clause. This would be obvious if the word order of the sample language were VSO or SOV instead of SVO. Now, let's see if we can apply the rule to oblique expressions: 1. John needs more money in Boston than in New York. John needs (more money) in Boston vs. John needs money in New York = John needs more money in Boston than New York. 2. John needs money more in Boston than in New York. OR John needs money in Boston more than in New York. John (more needs) money in Boston vs. John needs money in New York = John more needs money in Boston than New York. 3. 
John needs more money for a van than for a car. John needs (more money) for a van vs. John needs money for a car = John needs more money for a van than car. 4. John needs money more for a van than for a car. John (more needs) money for a van vs. John needs money for a car = John more needs money for a van than car. 5. John needed more money for a car than Bill needed. John needed (more money) for a car vs. Bill needed money for a car = John than Bill needed more money for a car. 6. John needed money for a car more than Bill needed money for a car. John (more needed) money for a car vs. Bill needed money for a car = John than Bill more needed money for a car. Note in all of the above, that the case tags "in" and "for", and the article "a" were not repeated. They could have been included without changing the meaning but would have been redundant. Thus, 4 could have been written "John more needs money for a van than FOR A car" or "John more needs money for a van than A car". Next, let's see if we can deal easily with adverbs: 7. John can run faster than Bill. John can run (more fast) vs. Bill can run fast = John than Bill can run more fast. 8. John can run fast more than Bill. John (more can) run fast vs. Bill can run fast = John than Bill more can run fast. OR John can (more run) fast vs. Bill can run fast = John than Bill can more run fast. 9. John can kick a football farther than Bill. 'far' which John can kick > 'far' which Bill can kick (first word = "far", 2nd difference = "Bill") = John can kick a football more far than Bill. Will this approach work for superlatives? Let's try: 10. John is the tallest student in the class. John (most is_tall) vs. students in the class are_tall [Note that we are using the P-s verb meaning 'to be tall'.] = John than students in the class most is_tall. Keep in mind, though, that "more" can also be used here in place of "most" with the same meaning. Thus, we can also say "John than students in the class MORE is_tall". 
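Rules (a) and (b) are mechanical enough to sketch as a small program. The Python below is only an illustration under stated assumptions: the `Clause` type, the `compare` function, and the slot names are hypothetical, and the English words "more", "than", and "more-than" stand in for the particles. A slot counts as different if its wording differs or if it carries "more", which matches the classification used in the eight cases above. One assumption goes beyond the rules as stated: when the wording differs in two or more slots, the sketch always lets "than" link the whole clauses, as in example (8).

```python
from typing import NamedTuple

SLOTS = ("subject", "verb", "obj")


class Clause(NamedTuple):
    """A simple subject-verb-object clause (hypothetical helper type)."""
    subject: str
    verb: str
    obj: str


def compare(first: Clause, second: Clause, more_on: str) -> str:
    """Gloss 'first exceeds second' using rules (a) and (b).

    more_on names the slot of `first` that is greater in quantity or degree.
    """
    if more_on not in SLOTS:
        raise ValueError(f"unknown slot: {more_on!r}")
    # Slots whose wording differs between the two clauses.
    differing = [s for s in SLOTS if getattr(first, s) != getattr(second, s)]
    if not differing:
        raise ValueError("identical clauses: nothing to compare")

    if differing == [more_on]:
        # Rule (a): one difference, so join the two items with "more-than".
        words = [
            f"{getattr(first, s)} more-than {getattr(second, s)}"
            if s == more_on else getattr(first, s)
            for s in SLOTS
        ]
        return " ".join(words)

    # Rule (b): "more" precedes the greater item ...
    marked = first._replace(**{more_on: "more " + getattr(first, more_on)})
    if len(differing) == 1:
        # ... and "than" sits between the two contrasting items.
        slot = differing[0]
        words = [
            f"{getattr(marked, s)} than {getattr(second, s)}"
            if s == slot else getattr(marked, s)
            for s in SLOTS
        ]
        return " ".join(words)
    # Assumption: with several differing slots, "than" links whole clauses.
    return " ".join(marked) + " than " + " ".join(second)


john = Clause("John", "reads", "novels")
print(compare(john, Clause("John", "reads", "short stories"), "obj"))
# John reads novels more-than short stories   (example 2)
print(compare(john, Clause("Bill", "reads", "novels"), "verb"))
# John than Bill more reads novels            (example 7)
```

The sketch handles only subject, verb, and object. Oblique expressions and adverbs would need extra slots, but the same two rules should carry over.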
As I stated earlier, many languages do not make a distinction between comparative and superlative.

Now, how do we represent these particles? Particles are closed-class words, they are invariant in form, and they are small in number. They rarely, if ever, undergo further derivation. Because of this, I feel that it would be best to assign forms to particles by using the particle terminator "-ka". For the comparative particles, I will arbitrarily create the following words:

    taka     'than'
    geka     'more'
    pika     'most'
    kapsuka  'as much/many as'
    soka     'less'
    juka     'least'

We will also allow the following abbreviated forms:

    getaka    = "geka taka"     'more than'
    pitaka    = "pika taka"     'the most among'
    kapsutaka = "kapsuka taka"  'as much/many as'
    sotaka    = "soka taka"     'less than'
    jutaka    = "juka taka"     'the least among'

Now, let's apply these particles to all of the sample sentences that we used with verbal comparatives. To save space and time, I will only provide detailed analyses for the more difficult examples:

1. John is taller than Bill.
   = John taka Bill geka is_tall.

2. John is as tall as Bill.
   = John taka Bill kapsuka is_tall.
   [Here, "taka...kapsuka" corresponds to English "as...as".]

3. John is less tall than Bill.
   = John taka Bill soka is_tall.

4. John is not as tall as Bill.
   = John is less tall than Bill.
   = John taka Bill soka is_tall.

5. John is not the same height as Bill.
   = John taka Bill not kapsuka is_tall.
   [We'll derive the word meaning 'not' later, when we discuss _modality_.]

6. John is the tallest of the three brothers.
   = John taka the three brothers pika is_tall.

7. John is the tallest student in the class.
   = John taka students in the class pika is_tall.

8. John is more quiet than shy.
   = John is_quiet geka taka is_shy.
   = John is_quiet getaka is_shy.

9. John helps Bill more than Mike.
   The inherent ambiguity of this sentence cannot be represented using particle comparatives. To capture the ambiguity, you must use the verbal comparative "gejope".

10.
John helps Bill more than he helps Mike. = John geka helps Bill taka Mike. 11. John helps Bill more than Mike does. = John taka Mike geka helps Bill. 12. Kids join gangs in Boston more than in Cowtown. = Kids geka join gangs in Boston taka Cowtown. 13a. John reads novels more than Bill. = John taka Bill geka reads novels. 13b. John reads novels more than short stories. = John geka reads novels taka short stories. 14. John reads more novels than short stories. = John reads novels geka taka short stories. = John reads novels getaka short stories. 15. John reads more novels than Bill. = John taka Bill reads geka novels. 16. John is more of a fighter than Bill. = John taka Bill geka is a fighter. 17. John is more of a whiner than a fighter. = John geka is a whiner taka fighter. OR John is geka a whiner taka a fighter. [BUT NOT "John is a whiner geka taka fighter". This is because "is a whiner" and "a whiner" are not countable, while "whiner" and "fighter" ARE countable. Thus, if we use the third form, we are saying that John is somehow equal to a quantity of whiners and fighters, with whiners outnumbering fighters.] 18. John likes taller girls than Louise. John likes (more tall) girls vs. John likes Louise = John likes geka tall girls taka Louise. 19. John likes taller girls than Bill. John likes (more tall) girls vs. Bill likes tall girls = John taka Bill likes geka tall girls. 20. Few people eat as much as John. Few people (as much or more eat) vs. John eats = Few people taka John kapsuka or geka eat. 21. John broke more windows than Bill. = John taka Bill broke geka windows. 22. The more he complains, the louder they play the music. he (more complains) vs. unspecified AND they play the music (more loud) vs. unspecified = They play the music geka loud when he geka complains. 23. The more I study, the less I know. = I soka know when I geka study. <- emphasizes degree = I know soka da when I study geka da. 
<- emphasizes quantity
    [The completely generic noun "da" can be paraphrased here as "things". In our sample language, "soka" cannot stand alone - it must modify something.]

24. The fewer friends we have, the lonelier we are.
    = We geka are_lonely when we have soka friends.

25. He is most happy when he is well fed.
    = He pika is_happy when he is well fed.

26. John had more money than Bill thought (he had).
    John had (more money) vs. John had money that Bill thought he had
    = John had money geka taka what Bill thought (he had).
    = John had money getaka what Bill thought (he had).

27. John baked more pies than Bill told him to (bake).
    = John baked pies geka taka what Bill told him to (bake).
    = John baked pies getaka what Bill told him to (bake).

28. More people stayed late than left early.
    (more people) stayed late vs. people left early
    = Geka people stayed late taka left early.

29. More than ten people showed up.
    = Geka ten people showed up.

30. I ate more because I was still hungry.
    = I geka ate because I was still hungry.       <- emphasizes degree
    OR I ate geka da because I was still hungry.   <- emphasizes quantity

31. John called Bill more than ten times yesterday.
    = John called Bill geka ten times yesterday.
    [Here, "ten times" is the numeric adverb "dailape". We can also use the single word "daigelape" instead of "geka dailape".]

32. John wanted a longer report than this.
    John wanted a (more long) report vs. John wanted this report
    = John wanted a long geka taka this report.
    = John wanted a long getaka this report.
    [In the above example, "this" is the deictic adjective "mimpano". You can paraphrase it as "John wanted a more-long-than-this report".]
    OR John wanted a report which taka this geka is_long.
    [In the above example, "this" is the deictic noun "mimpada".]

33. John can run faster than Bill.
    = John taka Bill can run geka fast.

34. You can buy a less expensive car here than at other dealers.
    = You can buy a soka expensive car here taka at other dealers.
    [In the above example, "here" is the adverb "mingupe" and "at" is the case tag "mepe".]

35. You can buy a less expensive car here.
    = You can buy a soka expensive car here.

36. Here is a less expensive car.
    = Here is a soka expensive car.

37. John has more reason to like her than Bill does.
    John (more has) reason vs. Bill has reason
    = John taka Bill geka has reason to like her.

38. He ate more than twice as much as I ate.
    = He ate geka twice what I ate.

39b. John can kick a football farther than Bill.
     = John taka Bill can kick a football geka far.

39c. John can kick a football far, over a longer period of time than Bill.
     = John taka Bill can kick a football far geka long.

39d. John can kick a football far, more often than Bill.
     = John taka Bill can kick a football far geka often.

Note that in all the examples, "taka" can be paraphrased in English as "compared to" or "in comparison with". For superlatives, the paraphrase would be "among".

As I mentioned at the beginning of this section, particle comparatives are slightly less flexible than verbal comparatives, but are likely to be easier to learn. However, I strongly feel that BOTH methods should be available for use in an AL. Students can learn to actually USE the form closest to their native language while learning to RECOGNIZE the other form.

14.3 UNFOCUSED COMPARATIVES

Natural languages have several words which indicate degree or quantity relative to an IMPLIED referent. In other words, these words have an unspecified focus. Here are some English examples:

    Excessive degree:      He is TOO happy now.
    Maximum degree:        He is EXTREMELY/MOST happy now.
    High degree:           He is QUITE/VERY happy now.
    Low degree:            He is NOT TOO/SLIGHTLY/SOMEWHAT happy now.
    Minimum degree:        He is HARDLY/BARELY happy now.
    Slightly less than unmarked degree:
                           He is ALMOST/NOT QUITE happy now.
    Emphatic degree:       He is DEFINITELY/ABSOLUTELY happy now.
    Emphatic zero degree:  He is NOT happy AT ALL/WHATSOEVER now.
    Exclusive degree:      He is JUST/ONLY happy.
Note that the maximum, high, low, and minimum degrees are already represented by the scalar polarity MCMs. Additional MCMs can be created for the other degrees. The terminator can change depending on what the word modifies. For example, the 'high' and 'exclusive' degree words can have the following forms:

    -pe, Verb modifier:           He studies VERY MUCH/A LOT.
                                  When she's here, he JUST studies.
                                  He's ONLY A poet (and nothing else).

    -no, Noun modifier:           He is QUITE A poet.
                                  It's a SIGNIFICANT/GREAT achievement.
                                  He's THE ONLY poet.

    -di, Previous-word modifier:  He's a VERY happy person.
                                  He studies ONLY when she's here.

    [Note that, in English, these words precede the word they modify. In the sample language, "very" would follow "happy" and "only" would follow "when".]

The actual implementation is quite straightforward, and I will not spend any more time on it here.

14.4 COMPARATIVE FACTORS AND DIFFERENCES

We often make comparisons in which we specify the magnitude of the difference between the entities being compared. Consider the following:

    1. The rope is longer than the stick.
    2. The rope is half as long as the stick.
    3. The rope is less than half as long as the stick.
    4. The rope is three meters longer than the stick.

Example (2) has a simple solution, if we paraphrase it:

    The rope lengamasi fevdeduno the stick.
    = The rope has the length of half of the stick.
    = The rope is half as long as the stick.

where "lengamasi" is the P/F-s version of the verb meaning 'to be long', and "fevdeduno" is the numeric adjective meaning 'one-half'. We can also implement it using particle comparatives as:

    The rope taka fevdeduno the stick kapsuka is long.
    = Literally: The rope compared with half the stick is as long.
    = The rope is as long as half of the stick.
    = The rope is half as long as the stick.

Example (3) also has two similar solutions:

    The rope lengamasosi fevdeduno the stick.
    = The rope has less length than half of the stick.
    = The rope is less than half as long as the stick.
or, using particle comparatives:

    The rope taka fevdeduno the stick soka is long.
    = Literally: The rope compared with half the stick is less long.
    = The rope is less long than half of the stick.
    = The rope is less than half as long as the stick.

But how do we handle (4), where the difference is not only additive, but also contains the unit of measure "meters"? Again, we have two simple solutions if we paraphrase the English:

    The rope lengamasi metada gezibie the stick.
    = The rope is as long as three-more-than-the-stick meters.
    = The rope is three meters longer than the stick.

or, using particle comparatives:

    The rope lengamasi zino metada getaka the stick.
    = The rope is as long as three meters more than the stick.
    = The rope is three meters longer than the stick.

where "gezibie" is the open adjective meaning 'three more than', and "metada" is the measure word meaning 'meter'. We can also easily add a comparative to the verb, creating a double comparative:

    The rope lengamagesi metada gezibie the stick.
    = The rope is longer than three-more-than-the-stick meters.
    = The rope is more than three meters longer than the stick.

or, using particle comparatives:

    The rope lengamagesi zino metada getaka the stick.
    = The rope is longer than three meters more than the stick.
    = The rope is more than three meters longer than the stick.

Note that all of the above can also be implemented using verbal comparatives. I will leave that task as an exercise for the interested reader.

15.0 DIMINUTIVES AND AUGMENTATIVES

We've seen how useful the scalar polarity MCMs can be in deriving many new words. So far, though, we've only applied them to stative (i.e. verbal) concepts. Fortunately, they can be just as useful and productive when applied to nouns. In doing so, we will be creating words that are commonly known as _diminutives_ and _augmentatives_.
As we mentioned earlier, when MCMs are applied to basic nouns, the root plus its basic classifier is treated as a single EFFECTIVE root, since the root by itself has only mnemonic value. Thus, in defining the semantics of diminutives and augmentatives, we must refer to the combination, rather than to the root by itself. For this reason, MCMs used to create augmentatives and diminutives will always follow the noun classifier.

In my opinion, the best way to define the semantics of diminutives and augmentatives is as follows:

    When a polarity MCM is applied to a basic noun, it magnifies or reduces the SIZE, INTENSITY, or both of the entity, in proportions that are most natural or typical for the entity. When no polarity MCM is used, it indicates a generic or typical entity that may include all possible sizes and/or intensities.

For easy reference, here are the scalar polarity MCMs again:

    -pi-  'maximally', 'extremely'
    -ge-  'very', 'highly'
    -so-  'not too', 'not very'
    -ju-  'minimally', 'barely', 'hardly'

Now, here are a few examples:

    xumpijida    = 'snowfall'
    xumpijipida  = 'blizzard'
    xumpijigeda  = 'snowstorm'
    xumpijisoda  = 'snow flurries'
    xumpijijuda  = 'a light dusting of snow'

    guanaida     = 'lake'
    guanaipida   = 'great lake'
    guanaigeda   = 'large lake'
    guanaisoda   = 'pond'
    guanaijuda   = 'pondlet', 'water hole'

    guajida      = 'hot spring'
    guajigeda    = 'geyser'
    guajijuda    = 'mudpot'

Now, keep in mind that when a scalar polarity MCM is immediately preceded and/or followed by a numeric morpheme, it will have a comparative interpretation rather than an augmentative or diminutive interpretation. For example, the word "guajigezida" means 'three more hot springs'. It does NOT mean 'three geysers'. If we wish to say 'three geysers', we must use the expression "zino guajigeda".

I'm not sure if it's wise to apply polarity MCMs to living species, because species are discrete and unique.
The polarity MCMs are ideal, however, for entities that span a range of sizes and intensities, such as 'lakes', 'geysers', 'storms', etc. For living species, it would probably be more useful to define approximate age ranges so that the polarity MCMs CAN be applied to living species as well as to human institutions. These would be defined in human terms, and those which are also applicable to other species could be used when needed. For example, we could use them as equivalents to the English diminutive "-let" in "piglet" or to the diminutive "-ling" in "yearling". Here is one possible scheme:

    -pi-  'elder'
    -ge-  'adult'
    -so-  'child'
    -ju-  'infant'

Keep in mind that "-pi-" derivations are a subset of "-ge-" derivations, and that "-ju-" derivations are a subset of "-so-" derivations. Thus, an 'elder' is an 'adult', but an 'adult' is not necessarily an 'elder'. Ditto for 'child/infant'.

It would also be useful to have two MCMs to indicate positive and negative QUALITY. The positive MCM would have the very general meaning of 'good' or 'desirable', and the negative MCM would have the very general meaning of 'bad' or 'undesirable'. These MCMs could be used to make distinctions such as the following:

    Negative MCM            Unmarked            Positive MCM
    ------------            --------            ------------
    'hovel/dump'            'house'             '(nice) home'
    'rascal/scamp'          'person'            'good fellow/nice guy'
    'infamous/notorious'    'famous/renowned'   'celebrated/popular'
    'cur'                   'dog'               'pooch'
    'scrawl/scribble'       'write'             'scribe'
    'brat'                  'child'             'angel (figurative)'

Some languages (e.g. Russian and Italian) use diminutive or augmentative morphology to indicate that the speaker feels affection or dislike for the named individual. English does something similar in expressions like "doggie", "sweetie", etc. There is really no need to create new MCMs to perform these functions, since they are already included in the meaning of the quality MCMs; i.e., one automatically has a positive attitude towards something that is 'good' or 'desirable'.
If a speaker really needs to emphasize affection or dislike, he can use appropriate adjectives such as English "dear" or "despised".

The positive and negative quality MCMs can also be used to make the distinction between beneficiary and maleficiary, as we discussed earlier:

    He voted IN FAVOR OF the resolution.
    He voted AGAINST the resolution.

In summary, we will be able to use the complete set of scalar MCMs to create augmentatives and diminutives based on size and/or intensity, plus a single 'positive quality' MCM and a single 'negative quality' MCM. If we wish to combine magnitude AND quality, we can use two MCMs. For example, the word for 'mansion' could be created with a polarity MCM to indicate size and with the positive quality MCM to indicate quality. We could also use a quality MCM twice to indicate extreme positive quality (perhaps related to royalty?) or extreme negative quality.

16.0 REGISTER VARIATIONS (HONORIFICS AND PEJORATIVES)

Many languages have words or morphemes that indicate the social status of the speaker relative to the listener or to a third party. The most common way of marking these differences is by means of special pronouns. For example, a more polite 2nd person pronoun can be used when speaking to a superior or elder.

However, these distinctions are not only made with pronouns. There are also many words, other than pronouns, that are only used in certain social contexts. For example, most English speakers will use the words "shit", "crap", "feces", "doo-doo", and "number 2" in entirely different settings, depending on who they are speaking with. In fact, some speakers will completely avoid using certain words, either because they are too formal or too rude. For example, many speakers will not use the 'dirty' word "shit" at all, while others may not use 'big' words like "explicate" or "obfuscation", or 'pretty' words like "lovely" or "marvelous".
Some languages also have words that differ in register that are effectively REQUIRED in certain contexts. Cambodian is a language that is especially rich in this respect. For example, there are three completely different words that mean 'to sleep'. The first is used when the sleeper is a superior or someone especially deserving of respect; the second is used when the sleeper is the speaker or a person of equal status; and the third is used when the sleeper is of lower status. Words or morphemes that indicate respect are normally called _honorifics_, while those which indicate disrespect are called _pejoratives_. In my opinion, the best way to deal with these register variations is to create special MCMs for honorifics, pejoratives, and other register variations. This approach is similar to the honorific affixes of Korean and Japanese. To implement this in our sample language, I will use the following MCMs: -xemna- fawning, subservient (macro "xemna" = "tenko + ge") -tenko- humble, inferior -mio- very polite, very formal, very respectful (macro "mio" = "zai + ge") -zai- polite/formal/respectful - used in formal speech and writing -cau- informal, slang - commonly used but not in formal speech or writing e.g. "sleazy", "goofy", "doll", "cop", "(nice and) cozy", "humongous", "ain't", "vamoose", "hunk", "nerd", "dude", "pig" = 'sloppy person' -loi- contemptuous, rude, insulting e.g. "jerk", "chink", "fuz", "whore", "wop", "kike", "queer", "faggot", "nigger", "pig" = 'policeman' -pie- vulgar, filthy, tasteless e.g. "shithead" and all 'forbidden' words and expressions Also, the above are just SOME of the possible registers. You may also want to create MCMs for other registers, such as 'macho' and 'effeminate', or an MCM to indicate conceit/superiority. Note that "-xemna-" and "-mio-" are actually macros. Other degrees can also be created. For example, if we wanted to create a register indicating 'groveling' or 'total abasement', we could use "tenkopi" or even "xemnapi". 
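The macro relationships among the register MCMs can be sketched in a few lines of Python. This is only an illustration: the function name `expand` and the table names are hypothetical, the glosses are abbreviated from the list above, and the sketch assumes that any polarity MCMs are simply appended after the register morpheme, as in "tenkopi" and "xemnapi". It lists the morphemes it finds and makes no claim about how stacked degrees such as "ge" plus "pi" should be interpreted.

```python
# Basic register MCMs with abbreviated glosses from the list above.
REGISTERS = {
    "tenko": "humble, inferior",
    "zai": "polite, formal, respectful",
    "cau": "informal, slang",
    "loi": "contemptuous, rude, insulting",
    "pie": "vulgar, filthy, tasteless",
}
# Macros: a short form standing for a basic register plus a polarity MCM.
MACROS = {"xemna": ("tenko", "ge"), "mio": ("zai", "ge")}
# Scalar polarity MCMs (each is two letters long).
POLARITY = {"pi": "maximally", "ge": "very", "so": "not very", "ju": "minimally"}


def expand(form: str) -> tuple[str, list[str]]:
    """Split a register form into its basic register and polarity MCMs."""
    for head in (*MACROS, *REGISTERS):
        if form.startswith(head):
            break
    else:
        raise ValueError(f"no register morpheme at the start of {form!r}")
    # A macro contributes its own polarity MCM; a basic register has none.
    base, *degrees = MACROS.get(head, (head,))
    rest = form[len(head):]
    while rest:
        mcm, rest = rest[:2], rest[2:]
        if mcm not in POLARITY:
            raise ValueError(f"unexpected morpheme {mcm!r} in {form!r}")
        degrees.append(mcm)
    return base, degrees


print(expand("mio"))      # ('zai', ['ge'])
print(expand("tenkopi"))  # ('tenko', ['pi'])
print(expand("xemnapi"))  # ('tenko', ['ge', 'pi'])
```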
16.1 LEXICAL REGISTER

The register MCMs can be applied to any word to indicate its social context. When intentionally used in the WRONG context, they would be interpreted as rudeness, unacceptable familiarity, excessive fawning, etc. An unmarked expression would be interpreted as 'neutral', and would be used in the vast majority of cases.

Pronouns of varying degrees of politeness can be easily formed. For example, the 2nd person pronoun "dustuda", meaning 'you', would become:

    dustumioda = French "vous", German "Sie", older English "you" (as opposed to "thou"), etc.
    dustucauda = French "tu", German "du", older English "thou", etc.

Nouns, verbs, adjectives, etc. can also have their register changed. For example, if a neutral word meaning 'feces' is "dengafada", then the word for 'crap' would be "dengafaloida" and the word for 'shit' would be "dengafapieda". [This word is both filthy AND flowery. I'll discuss SHORTER swear words below.]

It is important to emphasize that the register MCMs always reflect the attitude of the speaker toward the entity that is being modified by the MCM. For example, if the 'contemptuous' MCM is used with the pronoun meaning 'you', it shows that the speaker feels contempt for the listener. If it is used with the pronoun 'I', it indicates that the speaker feels contempt for himself. If the 'humble' MCM is used with the pronoun 'you', it shows that the speaker feels humble in the presence of the listener. If it is used with the pronoun 'I', it indicates that the speaker feels humble in his OWN presence, as if he were in awe of himself or something he just did.

We can also use the register MCMs directly as roots, but we first need to define the corresponding state or action. When a register is added to speech, the user is acting in a way that is intended to have an effect on the patient. In other words, register concepts are very similar to speech acts.
This becomes more obvious when we consider that the final state of the patient that the agent is attempting to achieve is simply not certain. For example, what state are we trying to achieve when we act humble, or polite, or chummy, or rude? The answer is that the state, if expressible at all, is so vague or even ambiguous that it is useless. Thus, it is best to use the register morphemes as action roots.

So, when register morphemes are used as roots, the agent will be acting in a way that may affect the patient. And, as with all action verbs, the focus, if present, will simply elaborate the action. With this in mind, it seems that the best default class for these derivations is A/P-p. Here are some examples:

    xemnasi   = to grovel before, to humble oneself before
    xemnano   = subservient, very humble
    xemnadeno = august, exalted, sublime ("-de-" = middle CCM)

    tenkosi   = to act inferior with, to be humble towards
    tenkono   = humble/inferior
    tenkodeno = great, lofty, eminent, noble

    miosi     = to be very polite to
    miodeno   = esteemed, honored, highly regarded or respected

    zaisi     = to be polite to, to act respectful to
    zaino     = polite, respectful
    zaideno   = respectable, deserving of respect

    caudeno   = ordinary/okay/alright/cool/just like us
                e.g. "He's an okay guy"

    loisi     = to be rude to, to act rude with/towards
    loino     = rude, contemptuous
    loideno   = stupid/stinking/cussed/damned
    loideda   = jerk, asshole, shithead

    piesi     = to swear obscenely at, to be profane with, to use dirty language at
    pieno     = vulgar, obscene, filthy, profane
    piedeno   = f*cking
    pievida   = f*ckhead, f*cking thing

Using the above scheme, some swear words are likely to be quite long. Since some people might object to using such long forms for 'dirty words', we could do exactly the opposite - create particle versions of words that are especially colloquial or rude. These would be formed exactly like other particles, and would therefore require dictionary look-up.
For example, if the word for 'feces' is "dengafada", then the word for 'crap' could be "dengaka" and the word for 'shit' could be "deka". Also, the dictionary entries could simply list the reduced form along with its longer form, rather than providing a complete definition, as in the following examples:

    dengaka = dengafaloida
    deka    = dengafapieda

This will be especially useful for a dictionary of a language IN the same language.

If you do create particle forms, the longer forms can still be used as more socially acceptable versions of otherwise forbidden words (e.g. "that crepitatious person" instead of "that f*ckhead").

Finally, the use of pejoratives is preferable to using metaphor (e.g. "pig") since pejoratives are culturally neutral and will always be understood. [I'll have more to say about the dangers of metaphor later.]

16.2 SENTENTIAL REGISTER

It will also be useful to apply register to complete utterances; i.e., by having a register word modify a complete sentence. In the sample language, we will accomplish this by creating a new part-of-speech terminator, "-fo". A word that ends in "-fo" will take a complete clause or sentence as its only argument. Here are a few examples:

    1. Tenkofo may I leave now?       <- humble
    2. Caufo I'm leaving now.         <- slang
    3. Loifo why did you do it?       <- insulting

Example (1) would have the sense of the sentence "I humbly request permission to leave now", (2) would be equivalent to "Hey, I'm splittin' now", and (3) would have the flavor of the English sentence "You louse! Why did you do it?". Similarly, the polite derivation "Zaifo" would be equivalent to the English word "please".

In other words, when "-fo" terminates a word whose root is a register morpheme, the speaker is the effective agent of the register act, the listener is the effective patient, and the embedded sentence is the effective focus, since it is the elaboration of the act of humility, politeness, rudeness, etc.
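The word-building steps described in 16.1 and 16.2 can be sketched as follows. This is a minimal Python illustration, not a definitive implementation: all function names are hypothetical, and `with_register` assumes that the part-of-speech terminator is always the final two letters of the word (as with "-da"), which holds for the examples in the text but is not stated as a general rule.

```python
# Reduced particle forms and the longer words they abbreviate (from 16.1).
SHORT_FORMS = {"dengaka": "dengafaloida", "deka": "dengafapieda"}


def with_register(word: str, mcm: str) -> str:
    """Insert a register MCM just before the word's terminator.

    Assumption: the terminator is the final two letters (e.g. "-da").
    """
    return word[:-2] + mcm + word[-2:]


def sentential(register: str) -> str:
    """Build a sentence-level register word by adding "-fo" to the root."""
    return register + "fo"


def long_form(word: str) -> str:
    """Look up a reduced particle form; other words are returned unchanged."""
    return SHORT_FORMS.get(word, word)


print(with_register("dustuda", "mio"))    # dustumioda
print(with_register("dengafada", "pie"))  # dengafapieda
print(sentential("zai").capitalize())     # Zaifo
print(long_form("deka"))                  # dengafapieda
```

Note that the reduced forms cannot be computed from the long forms. They have to be listed, which is why the sketch stores them in a table.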
Note, though, that use of "-fo" is NOT the same as use of an A/P/F-p verb plus a double middle or double passive CCM, because use of the CCMs may not indicate the agent or patient of the register act. With "-fo", there is never any doubt about who the agent and patient are.

17.0 TENSE AND ASPECT

_Tense_ marks the temporal location of an event as being either before, during, or after a particular reference time. The reference time is typically the moment of speech, but not always. Consider the following example:

    She told me that Bill had broken the window.

Here, the tense of the main clause is relative to the moment of speech, while the tense of the subordinate clause is relative to the time of the main clause. Thus, while tense often resembles deictics, it is not a true deictic itself. It simply describes the temporal state of an event relative to another time, which may or may not be the moment of speech.

    [Cf. our earlier discussion in the section on deictic locatives, and how a word such as "nearby" is NOT a true deictic, while a word like "here" IS a true deictic. For similar reasons, tense is a normally unfocused state concept - NOT a deictic.]

Tense has three basic values: past, present, and future. However, natural languages often have additional tenses that are variations of the three basic tenses, such as 'immediate past', 'remote future', as well as different forms for relative tenses.

_Aspect_ marks the temporal 'shape' of the event, and whether the event is being viewed from the 'inside' or from the 'outside'. There are two general aspects that apply to all events, and several more specific ones. Here are the two general aspects:

    Perfect or Perfective: The event is considered to be a single, bounded unit, viewed from the outside; i.e., the event is completed.

        e.g. Past:    John sang the song.
             Present: In this report, we show that...
             Future:  John will sing the song.
Imperfect or Imperfective: The event is considered to be a range of points in time, viewed from somewhere within the range; i.e., the event is in progress. e.g. Past: John was singing the song. Present: John is singing the song. Future: John will be singing the song. [The aspectual labels that I am using here are very common in the linguistic literature, but actual labels and their definitions vary somewhat from linguist to linguist. Also, since the words "perfect" and "imperfect" have common, unrelated, non-aspectual meanings in English, I will often use their respective synonyms "perfective" and "imperfective" instead to prevent misunderstandings.] In English, the combination of present and perfect is almost never used except in formal reports and, occasionally, in colloquial narration. Semantically, the combination is not really meaningful. If it were, it would imply that an event can be viewed as both complete and ongoing at the same time, which is self-contradictory. Because of this, natural languages will often use the present perfect form for something else. English, for example, almost always uses the present perfect form to represent a present tense generic or habitual meaning (discussed below). For example, the use of "sings" in "He sings very well" means that he HABITUALLY sings very well. It does NOT mean that he is actually singing at the present moment. Consequently, if you want the aspectual system of your AL to be semantically precise, then the unmodified present perfect should probably not be used. Doing so would imply an unnatural event that is somehow both complete and ongoing at the same time. I do not know of any situation in which this would be useful. A few languages, including English, also take advantage of certain, very common verb/tense/aspect combinations to achieve greater efficiency. For example, some verbs are almost always used with a perfective meaning in the past tense and an imperfective meaning in the present tense. 
Some languages will take advantage of this by using the perfect form all of the time if the perfect form is less marked (i.e. 'shorter') than the imperfect form. Here are some examples: Imperfective meaning, perfective form: John knows the answer. *John is knowing the answer. The book weighs 4 pounds. *The book is weighing 4 pounds. Perfective or imperfective meaning (depending on context), perfective form: John knew the answer. *John was knowing the answer. Imperfective meaning, perfective form: The fish stinks more than I can tolerate. *The fish is stinking more than I can tolerate. Perfective or imperfective meaning (depending on context), perfective form: The hat was too big. *The hat was being too big. [Note that all the verbs in this class are non-agentive "-s" verbs derived from state roots.] Because the English imperfective form using an auxiliary plus "-ing" is longer than the perfective form, and since only one meaning is likely, the more efficient perfective form is used instead without confusion. I suspect that this kind of crossover is only likely to occur in languages whose perfective forms are more efficient than their imperfective forms. However, it is not universal. In Turkish, for example, the less efficient but semantically correct imperfective form is used for verbs like 'know'. Thus, speakers of languages that are like English in this respect will have to be careful to ALWAYS use the imperfective form for the present tense of these verbs. They should also be careful to apply the semantically appropriate aspect when using the past and future tenses of these verbs. For example, the following sentences are all SEMANTICALLY correct in their use of the verb "know", even though they are GRAMMATICALLY incorrect in English: John is not knowing the answer now, but he did know it yesterday. John is not knowing the answer now, but he will know it tomorrow. When I asked John yesterday, he was knowing the answer. 
    When I will ask John tomorrow, he will be knowing the answer.

Keep in mind that use of the imperfective indicates that we are looking at a point in time within a range of points; in other words, we are viewing the event from the inside. Use of the perfective implies that we are looking at the event as if it were bounded; in other words, we are viewing it from the outside, as if it were a single point in time (although it could be a very 'large' point). Now, consider the following:

    Imperfective: John was eating when Bill left.
    Perfective:   John ate when Bill left.

The first example is not bounded and can potentially extend both before and after the tense time. It's even possible that John is still eating when the sentence is uttered. The second example IS bounded. John was definitely not eating before Bill left, and was definitely not eating when the sentence was uttered. In other words, a perfective event can NOT extend outside of the boundaries imposed by the tense time. An imperfective event CAN extend beyond those boundaries.

There are several aspects that are more specific than the perfect or imperfect aspects. Here is a list of the most important ones:

Iterative: The event is repeated more than once on a SINGLE occasion. e.g.

    Past perfect:      John repeatedly sang the song.
    Past imperfect:    John kept (on) singing the song.
    Present perfect:   John repeatedly sings the song.
    Present imperfect: John keeps (on) singing the song.
    Future perfect:    John will repeatedly sing the song.
    Future imperfect:  John will keep (on) singing the song.

[Note how the general aspects apply first to the verb, and the more specific aspect then modifies the result. Tense can be applied either first or last. (English speakers and speakers of other languages that require that tense be always specified will find it easier to apply tense first. Chinese speakers and speakers of languages that do not require tense marking will probably find it easier to apply tense last.)
For example, before applying the iterative aspect, we first start with past imperfect "John was singing the song". The imperfect effectively 'opens up' the event and focuses on the points within it. Next, the iterative then repeats the points within a single event. For the perfective, the singing event remains closed, and the entire, bounded event is repeated. Later, we will see how to apply aspects in different orders to achieve entirely different results.] Habitual: The event is repeated more than once on DIFFERENT occasions. e.g. Past perfect: John used to sing the song. Past imperfect: John used to be singing the song. Present perfect: John sings the song (e.g. often). Present imperfect: John is writing a book (but not necessarily at this very moment) Future perfect: John will sing the song (e.g. from now on). Future imperfect: From now on, when John is singing the song... Inceptive: Only the start point of the event is under consideration. Since the inceptive represents a point in time within a range of points, the event must first be 'opened up' before the point can be referenced. Thus, an imperfective must be applied first before the inceptive can be applied. In fact, perfective followed by inceptive is semantically meaningless; cf. "*He started sang the song". e.g. Past perfect: not possible Past imperfect: John started singing the song. Present perfect: not possible Present imperfect: John starts singing the song. Future perfect: not possible Future imperfect: John will start singing the song. Continuative: A point or range of points in time somewhere in the middle of the event is under consideration. Since the continuative represents points in time within a range of points, the event must first be 'opened up' before these points can be referenced. Thus, an imperfective must be applied first before the continuative can be applied. In fact, as was the case with inceptives, a perfective followed by a continuative is semantically meaningless. e.g. 
Past perfect: not possible Past imperfect: John was still singing the song. Present perfect: not possible Present imperfect: John is still singing the song. Future perfect: not possible Future imperfect: John will still be singing the song. Terminative: Only the end point of the event is under consideration. Since the terminative represents a point in time within a range of points, the event must first be 'opened up' before the point can be referenced. Thus, an imperfective must be applied first before the terminative can be applied. In fact, as was the case with inceptives and continuatives, a perfective followed by a terminative is semantically meaningless. e.g. Past perfect: not possible Past imperfect: John stopped singing the song. Present perfect: not possible Present imperfect: John stops singing the song. Future perfect: not possible Future imperfect: John will stop singing the song. Completive: The event is done to completion, reaching a natural or obvious endpoint. English generally uses the verb "to finish" or an expression such as "really" or "to completion" to indicate this aspect. Sometimes, though, the particle "up" is used, as in "to eat up", "to smash up", "to fill up", etc. e.g. Past perfect: John finished washing the dishes. Past imperfect: John was finishing washing the dishes. Present perfect: In this paper, we thoroughly discuss... Present imperfect: John is finishing washing the dishes. Future perfect: John will finish washing the dishes. Future imperfect: John will be finishing washing the dishes. [Use of the English word "thoroughly" in the above example is only approximate. The English word often implies high quality or an extreme attention to detail. The completive aspect does not have this implication.] The above definitions are, I believe, the best ones possible for an AL designer because they cover those categories of aspect that appear in most natural languages. 
Categories that appear in very few languages, such as 'excessive duration', 'limited duration', 'frequentative', 'partial completion', etc. can be handled by using adverbs, although the AL designer is certainly free to add these and other categories to the above list. Another possibility (which I will illustrate later) is to derive the less common aspects from the major ones. In summary, tense describes the EXTERNAL temporal state of an event, while aspect describes the INTERNAL temporal state of an event. Now, in the above list, I intentionally omitted the aspect usually referred to as _generic_. Here are some examples: Squirrels live in trees. Americans produce too much garbage. Sapphires cost more than diamonds. Dogs bark when the moon is full. Many (and perhaps most) languages use the same form for both habitual and generic aspects. This is possible because the subject of a habitual is always definite while the subject of a generic is always indefinite: Generic: Dogs bark when the moon is full. Habitual: His dogs bark when the moon is full. Thus, as long as your AL allows you to make a definite/indefinite distinction without too much difficulty, you can use the same form for both habitual and generic. Besides, "genericness" is really a property of a noun - NOT of a verb. [Incidentally, English also allows a definite article to appear with an indefinite noun, which can be confusing, as in "The elephant lives in Africa". Here, the context must make it clear whether the speaker means a particular elephant or elephants in general. Fortunately, there is no need to allow this kind of construction in your AL.] 17.1 IMPLEMENTING TENSE AND ASPECT Tense seems to be morphologically or lexically linked to aspect in most natural languages, and I suggest that they be combined in an AL as well. In our sample language, I will accomplish this by allocating roots that are MNEMONICALLY compositional, just as we did for deictics. 
In other words, tense and aspect words will be formed from true, unique root morphemes, but we will design them in a way that will display their inherent compositionality. Here are the details:

    Tense                  Aspect
    -----                  ------
    Past:        lu-       Perfect:      -- (default)
    Present:     co-       Imperfect:    -nsa-
    Future:      ti-       Iterative:    -mpo-
    Unspecified: ba-       Habitual:     -ntu-
                           Inceptive:    -spi-
                           Continuative: -mbe-
                           Terminative:  -nzi-
                           Completive:   -ksu-
                           Reserved:     -ple-
                           Reserved:     -lto-
                           Unspecified:  -nda-

Also, tense/aspect words take an entire clause as an argument, rather than just modifying the verb. Thus, they will require the terminator "-fo", which we introduced in the section on register variations.

[Incidentally, note that the terminative aspect is actually equivalent to the negated continuative; i.e., an event that does not continue has stopped, and vice versa. Thus, the terminative could just as easily be formed from the continuative plus the negating morpheme "-na-". For example, "lunzifo" is actually an abbreviation for "lumbenafo".]

Here are a few examples:

    John looked at the house.
      = past perfect
      = John lufo look at the house.

    John will be reading a book when I arrive.
      = future imperfect
      = John tinsafo read a book when I arrive.

    John is replacing the front tire.
      = present imperfect
      = John consafo replace the front tire.

It is also possible to apply more than one of the more specific aspects at the same time, but the way this is implemented varies considerably from language to language. Some languages have simple and regular rules for doing so, while others must depend on context or periphrasis. English is an example of the latter. Consider the following three English sentences:

    1. Kijani started singing the song 5 minutes ago.
    2. Kijani started singing the song 5 years ago.
    3. Kijani started singing the song 5 years ago and has never stopped, even to eat or sleep.

In most circumstances, (1) would be interpreted as inceptive-imperfect.
Example (2), though, would normally be interpreted as a combination of perfective, then habitual, then inceptive; i.e. inceptive-habitual-perfective (assuming Kijani is a human with normal human limitations). However, context makes it clear that (3) can only be interpreted as inceptive-imperfect, while also implying that Kijani is some kind of supernormal creature. [Note that I am using reverse order to indicate the sequence of aspects when more than one is being applied. Thus, inceptive-habitual-perfective has the nesting order inceptive(habitual(perfective(verb))). In other words, perfective is applied first to the unmarked verb, then habitual is applied to the perfective verb, and finally, inceptive is applied to the habitual- perfective verb.] When an aspect is applied to a verb or to a verb that has already been aspectually modified, there is often an assumption as to the initial aspect of what is being modified. For example, inceptive can only be applied after an imperfective. In cases such as this, there is no need to explicitly apply the imperfect aspect before applying the inceptive, since inceptive can NEVER follow a perfect aspect in a sequence. Thus, we can apply inceptive directly to the verb without first applying imperfective. For the sake of efficiency, it will be useful to define a default assumption for aspects that can apply to either perfective or imperfective. To this end, I suggest the following defaults: iterative-perfect habitual-perfect inceptive-imperfect (this is the only possibility) continuative-imperfect (this is the only possibility) terminative-imperfect (this is the only possibility) completive-perfect Thus, for example, if habitual-perfect is needed, there is no need to first mark the verb as perfective - only the habitual needs to be specified. Here are some examples that use the defaults: He keeps sneezing. = present iterative-perfect = He compofo sneeze. John started singing the song. 
= past inceptive-imperfect = John luspifo sing the song. John used to sing when I visited. = past habitual-perfect = John luntufo sing when I visited. Now, compare the last one above with: John used to be singing when I visited. = past habitual-imperfect = John luntufo bansafo sing when I visited. = John luntunsafo sing when I visited. [Since the default assumption for habitual is perfective, we must first apply imperfective "-nsa-" before applying habitual "-ntu-".] Note that when more than one aspect is applied to a verb using more than one word, tense is normally applied to the outermost aspect. When this occurs, the inner aspect(s) will be tenseless, as is normally the case with infinitives, participles, and other equivalent non-finite forms in natural languages. Here are some more examples that require more than one aspect: John was starting to sing the song when... = past imperfect-inceptive-imperfect = John lunsafo baspifo sing the song when... = John lunsaspifo sing the song when... [Note that since inceptive specifies a single point in time, the RESULTING aspect is perfective. By applying an imperfective to the inceptive, the start point is 'opened up' and forced to become a range of points.] John used to stop smoking as soon as I arrived. = past habitual-terminative-imperfect = John luntufo banzifo smoke as soon as I arrived. = John luntunzifo smoke as soon as I arrived. John started to (habitually) smoke when he was 15 years old. = past inceptive-habitual-perfect = John luspifo bantufo smoke when he was 15 years old. = John luspintufo smoke when he was 15 years old. 17.2 DEFAULT TENSE AND ASPECT In uninflected natural languages (e.g. Chinese, Cambodian, Indonesian), tense and aspect are usually not explicitly indicated if the intent of the speaker is clear from the context. Since our sample language is also uninflected, I suggest that we adopt a similar approach. 
However, I also suggest that we create a few simple rules that will make the intended tense and aspect obvious - even to a computer. Here are the rules that I feel are both natural and efficient: 1. The main verb of a sentence will be present-imperfective for "-s" verbs and past-perfective for "-d" and "-p" verbs. 2. Verbs in subordinate clauses will have the same tense as the main verb. Present tense verbs and/or "-s" verbs will be imperfective, while "-d" and "-p" verbs will be perfective. 3. If a verb is modified by a temporal or aspectual adverb or phrase, then the adverb or phrase will indicate the tense or aspect. Here are some examples for rule (1): He break the window. A/P-d main verb = He broke the window. He ask me a question. A/P/F-p main verb = He asked me a question. He know geometry. P/F-s main verb = He 'is knowing' geometry. = He knows geometry. John walk to school. AP-s main verb = John is walking to school (at this very moment). Here are some examples for rule (2): He will drive to town after she call. "to call" = A/P-p = He will drive to town after she calls. [Note that the embedded verb "calls" is actually future tense in meaning although English uses the present tense.] I did not complain when John ignore me. "to ignore" = AP/F-s = I did not complain when John was ignoring me. He knows that Mike lie. "to lie" = A/P-p = He knows that Mike is lying. He knew that Mike lie. = He knew that Mike lied. Here are some examples for rule (3): John buy a new bicycle yesterday. = John bought a new bicycle yesterday. He speak to his sister tomorrow. = He will speak to his sister tomorrow. He speak to his sister now. = He is speaking to his sister now. [The adverb "now" forces the verb to be present tense. And since present-perfect is semantically meaningless, it must be imperfective, even though the verb is "-p".] John teach math for three years. = John taught math for three years. = John 'was teaching' math for three years. 
[Note that "for three years" is a periphrastic way of marking the imperfect aspect.] John study math for three years. = John studied math for three years. = John 'was studying' math for three years. John sneeze three times. = John sneezed three times. [Note that "three times" implies iterative aspect.] I do not speak while John teach. = I do not speak while John is teaching. [Here, the main verb is present-habitual-imperfective. Since it is present tense, the subordinate verb must be imperfective. And even if it were not, the case marker "while" would force an imperfective interpretation of "teach", as in the next example...] I did not speak while John teach. = I did not speak while John was teaching. [Here, only the case marker "while" forces "teach" to be imperfective.] In the last example, if we had used an aspectless case marker, such as "when", then "teach" would have to be explicitly marked as imperfective to get the 'was teaching' interpretation. Otherwise, the default interpretation would imply that the speaking and teaching events were points in time that completely overlapped each other; i.e. "I did not speak at the same time that John taught" or "I was not speaking at the same time that John was teaching". Other temporal case markers, such as "after" and "before", imply a single cut- off point in time, and these, by default, will force a subordinate verb to be perfective. 17.3 FURTHER DERIVATION USING TENSE/ASPECT ROOTS The tense/aspect roots represent many useful concepts, and can undergo further derivation to produce many useful words. We can accomplish this by suffixing appropriate classifiers, CCMs, and MCMs. Before we can proceed, though, we need to define the semantics of the conversion process. In other words, since we will be using a tense/aspect root as a state root, we need to define the meaning of the resulting state. Here are the meanings that I feel are most productive: 1. 
If a root contains tense information, the corresponding state will represent a point or range of points on the time line. 2. If a root contains aspectual information, the corresponding state will represent a point in time (perfective), a range of points in time (imperfective), a repeated set of points in time (iterative), the starting point of a range of points in time (inceptive), etc. 3. The patient of a derivation is the entity or event that experiences the temporal/aspectual state. 4. If a derivation is focused, then the focus is a referent event. In other words, the patient experiences the temporal/aspectual state relative to the focal event. Thus, the patient can be either an entity OR an event. The focus, however, MUST be an event - it can never be an entity (unless, of course, the focal "entity" implies or is somehow associated with an actual event or time). In summary, the language designer can avoid mistakes in applying temporal and aspectual states by always conceptualizing them as points on the time line. Finally, since all tense/aspect words are inherently relational, the default class will be P/F-s. With the above in mind, we can now create the following useful words: -tinda- (future, unspecified aspect): tindasi - P/F-s verb = 'to be in the future relative to the focus' 'to occur after', 'to post-date' e.g. The accident tindasi the party. = The accident occurred after the party. [Note that a "-d" version of this verb is not very useful. It would mean something like 'to arrive at or enter a time in the future relative to the focus'. For example, using the "-d" version, the above sentence would mean 'The accident took place while arriving at a time after the party'. This would imply that the accident started during or even before the party. In other words, the accident moved along the time line from a point in time NOT after the party to a point in time AFTER the party.] tindape - P/F-s case tag = 'after' e.g. He left tindape I did. 
= He left after I did. tindano - P/F-s adjective = 'subsequent', 'following' [The open adjective form "tindabie" has the meaning 'subsequent to'.] tindaseno - P-s adjective = 'future', '-to-be' e.g. I just saw the tindaseno bride. = I just saw the bride-to-be [Note the difference in meaning between the P-s and the P/F-s forms.] tindasepe - P-s adverb = 'later', 'afterwards', 'someday', 'sometime in the future' e.g. I wash the dishes tindasepe. = I'll wash the dishes later. [Note that there is no need to mark the verb "wash" for future tense, since the future meaning is present in the adverb.] tindaseda - P-s noun = 'a future event', 'something which happens in the future' tindabeda - time noun = 'future time', 'the future' These should look familiar. In the section on temporal case tags, we used the root "lunda-" to represent the temporal relationship meaning 'before'. This root, as we can now see, is simply the tense/aspect root representing past tense with unspecified aspect. [Unspecified aspect is used because the actual aspect has already been provided by the main verb. To provide it again in the case tag would be redundant.] Similarly, the root "conda-" (present tense, unspecified aspect) can be used to create the P/F-s temporal case tag "condape" meaning temporal 'at', 'when', or 'at the time of'. Thus, there is no need to randomly allocate state roots to perform temporal functions. In fact, if we should ever discover that we are unable to "derive" an essential temporal case tag using the above approach, then it will imply that our tense/aspect system is incomplete. If this should occur, then an appropriate new entry should be added to the tense/aspect table. Now, let's create some more useful words: -lu- (past, unspecified aspect): lugada - P-s [+F] noun = 'precursor/forerunner' lugagiu - P-s [+F] open noun = 'precursor of/forerunner to' e.g. The telegraph was the lugagiu all modern communications. 
= The telegraph was the precursor of all modern communications. -baspi- (tenseless inceptive): baspisi - P/F-s verb = 'to be at the start/beginning point of (an event)' e.g. John baspisi a new adventure. = John is at the start of a new adventure. baspisuasi - AP/F-d verb = 'to cause oneself to enter the start/beginning point of an event', 'to start', 'to begin' e.g. John baspisuasi his homework. = John started his homework. The P-d verb "baspipiasi" also means 'to start/begin' but would be used when there is no agency, as in "The rain started at 3 PM". Thus, it's closer to the English word "commence". From "-baspi-", we can also derive the following: P-s adjective: baspiseno = 'initial' P-s adverb: baspisepe = 'initially', 'at the start', 'in the beginning' Now, here are some additional useful derivations: -bambe- (tenseless continuative): bambesi - P/F-s verb = 'to be at some point between start and end', 'to be during', 'to take place during' e.g. The aria bambesi the castle scene. = The aria takes place during the castle scene. bambepe - P/F-s case tag = 'during' e.g. John called his brother bambepe his lunch break. = John called his brother during his lunch break. bambesesi - P-s verb = 'to be still going on' e.g. The celebration bambesesi. = The celebration is still going on. bambelape - "0" adverb = 'in the meantime', 'meanwhile' bambenalape - "0" adverb = 'no longer', 'not any more' bambefisi - AP/F-s verb = 'to maintain oneself at some point between start and end', 'to be still doing', 'to be still at' e.g. John bambefisi his homework. = John is still doing his homework. bambesuasi - AP/F-d verb = 'to cause oneself to enter at some point between start and end', 'to continue doing', 'to go back to (doing)' e.g. John bambesuasi his homework. = John went back to/continued doing his homework. Here, the "-s"/"-d" distinction is definitely a useful one. The AP/F-s verb is equivalent to the English phrase "to be still doing". 
The AP/F-d verb, however, implies that the event continued after some kind of pause or stoppage. In effect, it means 'to re-enter the event', which is the way the English verb "to continue" is most commonly used. The terminative is also very useful: -banzi- (tenseless terminative): banzipusi - A/P-d = 'to stop', 'to bring to a halt' e.g. The police banzipusi the parade. = The police halted the parade. banzipiasi - P-d = 'to stop', 'to come to a stop' e.g. The rain banzipiasi. = The rain stopped. banzisuasi - AP/F-d = 'to exit an event', 'to stop doing' e.g. John banzisuasi his homework. = John stopped doing his homework. The P/F-s derivation would mean something like 'to be at the end of (an event)'. Now, let's speed things up and derive a large number of useful words. To save time, I'll just list the results (with an occasional comment here and there) and leave the details to the interested reader: -banda- (tenseless, unspecified aspect): P-s verb: bandasesi = 'to exist in time', 'to occur', 'to happen', 'to take place' -consa- (present tense, imperfective): P-s adjective: consaseno = 'in progress', 'happening', 'currently ongoing' P-d adjective: consapiano = 'unfolding' (as in "the unfolding story") -bansa- (tenseless, imperfective): P-s adjective: bansaseno = 'ongoing', 'having duration' P-s verb: bansasesi = 'to take time', 'to go on' It's important to note that we cannot derive a word meaning 'duration' from a tense/aspect root. The tense/aspect roots indicate the temporal 'location' and the temporal 'shape' of an event, but not the actual length of the event. Thus, a word with the meaning 'duration' must be derived from a measure verb, which we discussed earlier. For the same reasons, a word meaning 'to last' or 'to take', as in "The job took three hours" must be derived from the same measure word. 
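The derivations so far follow a mechanical pattern: a tense prefix, an aspect morpheme, and a class-marking ending. Here is a minimal Python sketch of that pattern. It assumes the morphemes from the tense/aspect table in section 17.1 and takes the endings directly from the example words above, treating them as opaque strings rather than analyzing them into classifiers and CCMs. The dictionary keys and the function names "root" and "derive" are illustrative inventions, not part of the language.

```python
# Tense prefixes and aspect morphemes, copied from the table in section 17.1.
TENSE = {"past": "lu", "present": "co", "future": "ti", "unspecified": "ba"}
ASPECT = {
    "perfect": "",           # the unmarked default
    "imperfect": "nsa",
    "iterative": "mpo",
    "habitual": "ntu",
    "inceptive": "spi",
    "continuative": "mbe",
    "terminative": "nzi",
    "completive": "ksu",
    "unspecified": "nda",
}

# Endings as they appear in the example words of this section.
# Assumption: each ending is treated as a single opaque string.
ENDING = {
    "P/F-s verb": "si",       # tindasi, baspisi, bambesi
    "P/F-s case tag": "pe",   # tindape, bambepe
    "P-s verb": "sesi",       # bandasesi, bansasesi, bambesesi
    "P-s adjective": "seno",  # tindaseno, consaseno, baspiseno
    "P-s adverb": "sepe",     # tindasepe, baspisepe
    "P-s noun": "seda",       # tindaseda
}


def root(tense, aspect):
    """Build a tense/aspect root such as 'tinda' or 'baspi'."""
    return TENSE[tense] + ASPECT[aspect]


def derive(tense, aspect, word_class):
    """Attach a class-marking ending to a tense/aspect root."""
    return root(tense, aspect) + ENDING[word_class]


if __name__ == "__main__":
    print(derive("future", "unspecified", "P/F-s verb"))           # tindasi
    print(derive("present", "imperfect", "P-s adjective"))         # consaseno
    print(derive("unspecified", "unspecified", "P-s verb"))        # bandasesi
    print(derive("unspecified", "continuative", "P/F-s case tag")) # bambepe
```

The sketch only covers the endings that occur in the examples; agentive forms such as "baspisuasi" would need further entries.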
Now, let's continue with some more useful derivations: -bampo- (tenseless, iterative): A/P-s verb: bampozoyasi = 'to iterate', 'to do (something) repeatedly' P-s adverb: bamposepe = 'repeatedly', 'over and over' P-s process: bamposepada = 'iteration/repetition' [Do not confuse aspectual iteration with numeric repetition (which uses specific numeric morphemes). Aspectual iteration implies a single event with several sub-events. Repetition implies two or more distinct events. Thus, the verb meaning 'to repeat' is simply "zefigefesi", which literally means 'to do one more time' or 'to re-do'.] -bantu- (tenseless, habitual): P-s adjective: bantuseno = 'habitual', 'regular' P-s adverb: bantusepe = 'habitually', 'regularly' P-s process: bantusepada = 'habit', 'wont', 'custom' -baksu- (tenseless, completive): AP/F-d verb: baksusuasi = 'to complete', 'to finish' AP/F-d process: baksusuapada = 'completion' AP/F-d result: baksusuamanteda = 'final result' P-s adjective: baksuseno = 'complete' P-s adverb: baksusepe = 'completely', 'to completion' Languages also need to be able to express degrees of nearness and remoteness in time. For this purpose, I suggest that we again use the scalar polarity MCMs. Here they are again for easy reference: -pi- 'maximally', 'extremely' -ge- 'very', 'highly' -so- 'not too', 'not very' -ju- 'minimally', 'barely', 'hardly' The CCM "-na-" meaning 'not' will also be useful. 
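The scalar MCMs slot in between the root and the ending of a word. Here is a small Python sketch of that step. The ordering it assumes (root, then "-na-" if present, then the MCM, then the ending) is inferred from forms in this section such as "lunzijufo" and "menapipe". The function name "grade" and its parameters are hypothetical, chosen only for illustration.

```python
# Scalar polarity MCMs, as listed above.
MCM = {"maximally": "pi", "very": "ge", "not very": "so", "minimally": "ju"}
NEGATION = "na"  # the CCM meaning 'not'


def grade(root, ending, degree=None, negated=False):
    """Insert an optional negation and an optional scalar MCM
    between a root and its ending."""
    parts = [root]
    if negated:
        parts.append(NEGATION)
    if degree is not None:
        parts.append(MCM[degree])
    parts.append(ending)
    return "".join(parts)


if __name__ == "__main__":
    print(grade("tinda", "sepe", "not very"))            # tindasosepe
    print(grade("lunda", "sepe", "minimally"))           # lundajusepe
    print(grade("tinda", "sepe", negated=True))          # tindanasepe
    print(grade("lunzi", "fo", "minimally"))             # lunzijufo
    print(grade("me", "pe", "maximally", negated=True))  # menapipe
```

The same function serves adverbs, "-fo" words, and locative case tags because the MCM always occupies the same slot.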
Here are some useful derivations:

    banzisepe   = 'at the end', 'finally' (tenseless terminative)
    banzipisepe = 'at the very end', 'right at the end'
    banzigesepe = 'near the end'
    banzisosepe = 'well before the end'
    banzijusepe = 'way, way before the end'

    tindasepe   = 'later', 'afterwards' (future, unspecified aspect)
    tindapisepe = 'in the remote future'
    tindagesepe = 'in the distant future'
    tindasosepe = 'soon', 'in the near future'
    tindajusepe = 'immediately'
    tindanasepe = 'at the same time or earlier'

    lundasepe   = 'earlier', 'in the past', 'already' (past, unspecified aspect)
    lundapisepe = 'in the remote past'
    lundagesepe = 'in the distant past', 'a long time ago'
    lundasosepe = 'recently', 'in the recent past'
    lundajusepe = 'just'
    lundanasepe = 'at the same time or later'

    baksusepe   = 'completely', 'thoroughly', 'to completion' (tenseless, completive)
    baksupisepe = 'really thorough', 'absolutely to completion'
    baksugesepe = 'almost', 'come close to', 'not quite'
    baksusosepe = 'hardly at all', 'barely at all'
    baksujusepe = 'not at all'
    baksunasepe = 'incompletely'

The above examples are all adverbs, but other versions, especially the adjectives and nouns, will also be useful. Actually, though, the adverbs will probably not get much use, since the MCMs can be applied directly to the tense/aspect words. We can also apply the numeric MCMs meaning 'always', 'often', etc. Here are some examples:

    lunzifo      = past terminative
    lunzijufo    = 'just stopped'
                   e.g. "He lunzijufo run" = "He just stopped running"

    lufo         = past perfective
    lujufo       = 'just'
                   e.g. "He lujufo arrive" = "He just arrived"

    tifo         = future perfective
    tijufo       = 'to be about to'
                   e.g. "He tijufo leave" = "He's about to leave"

    baksufo      = tenseless completive
    baksugefo    = 'almost'
                   e.g. "He baksugefo fail" = "He almost failed"

    lunsafo      = past imperfective
    lunsasaksifo = 'was always ...ing'
                   e.g. "He lunsasaksifo leave early" = "He was always leaving early"

And so on. Note that we can also use polarity MCMs for LOCATIVE concepts.
For example, the case tag "mepe", meaning 'at/in', can undergo further derivation to produce "mepipe" = 'right at', "megepe" = 'almost at/near', "mesope" = 'not very close to', and "mejupe" = 'not at all close to'. The negative form "menape" meaning 'not at' or 'away from' can undergo further derivation to produce "menapipe" = 'very far from', "menagepe" = 'far from', "menasope" = 'not too far from', and "menajupe" = 'not far at all from'.

Finally, when "-fo" is applied to a tense/aspect root, it is very similar to the combination "ma" + "xi" + "si", where "-ma-" is the P/F-s verb classifier and "-xi-" is the anti-middle CCM. The result is a P-s [-F] verb, which takes an embedded sentence as its only argument. For example, "John tifo leave" meaning 'John will leave' is very similar to "Timaxisi John leave" meaning 'At some time in the future relative to an unspecified focus, John will leave'. However, in main clauses, where tense is normally deictic, the unspecified focus is normally the time of the utterance. When using "-maxisi", the unspecified focus does not necessarily have to be the time of the utterance.

Note how this strong deictic effect is also present when we use "-fo" with register morphemes. For register acts, however, the focus is the embedded sentence. The deictic elements are the agent, who is always the speaker, and the patient, who is always the listener.

18.0 MODALITY

Whenever we speak, we always provide some indication of our commitment or attitude towards what we are saying. These attitudes are always subjective and range in concept from certainty to uncertainty, from insistence to prohibition, from encouraging to warning, and so on. This subjective judgement of a speaker towards what he is saying is called the _modality_ of an utterance, and can vary in kind as well as in degree. Here are some English examples: You must go now. -> 100% obligation You should go now. -> high obligation You need to go now. -> high necessity He left.
-> 100% probability He may have left. -> undefined probability He might have left. -> low probability He did not leave. -> zero probability Did he leave? -> interrogative probability He should be there. -> high probability Does it matter that he won? -> interrogative importance It seems the storm is over. -> high evidentiality [Evidentiality indicates the speaker's judgement about how reliable the information is.] As you can see from the above examples, there is very little regularity in the English modal system, and this is typical of perhaps all natural languages. Modal systems evolve slowly over time and can be quite idiosyncratic. In a single language, some modals may take the form of inflections, some may use auxiliaries, while some may use verbs, adverbs, or other open class words. In this respect English is typical. Unfortunately, different languages implement modal concepts in different ways, and a particular modal may be used for more than one type of modality or may cover different degrees. For example, the English modal "should" can express either probability, obligation, or evidentiality. There may also be different ways of expressing the same type and degree of modality. For example, the English expressions "should" and "ought to" are essentially synonymous, as are "must/have to", "does it matter/is it important", and so on. Finally, modalities often overlap in meaning. For example, both "must" and "have to" can imply either 'obligation', 'necessity', 'probability', or 'evidentiality'. In fact, the modal systems of natural languages vary SO much and are SO idiosyncratic, that a truly neutral and regular system is unlikely to resemble the system of ANY natural language. Fortunately, the semantics of modality is highly regular, and CAN be categorized. 18.1 MODAL CONCEPTS The most basic modal concept is 'probability'. 
It is the most basic because it provides us with the most common sentential types: positive statements, negative statements, and interrogative statements. And, as we will see later, it is also often used in conjunction with other types of modality. Here is a breakdown of the probability modality:

    probability:    He left yesterday.               100% probable
                    He must have left yesterday.     high
                    He may have left yesterday.      undefined
                    He might have left yesterday.    low
                    He did not leave yesterday.      0%
                    Did he leave yesterday?          interrogative

The 100% probability modality is normally referred to as the _indicative_, and the 0% probability modality is referred to as the _negative_. Also, the 100% probability modality is normally unmarked. When it is explicitly marked, it is called the _emphatic_. (Cf. "He left yesterday" vs. "He did leave yesterday" or "He definitely left yesterday".)

There are also several other modalities. However, in most natural languages, these modalities generally only have unique modal forms for the 100% or high degree, if at all. Other degrees of modality are generally obtained by use of adverbs, normal verbs, disjuncts, and other kinds of periphrasis. Here is a listing of a few of the other modalities, illustrating the 100% and the high degrees of each one in English:

    obligation:     He must go now.                    100%
                    He should (= ought to) go now.     high

    necessity:      It is essential that he go now.    100%
                    He needs to go now.                high

    evidentiality:  It's obvious that he left.         100%
                    He seems to have left.             high

    inevitability:  He must be there by now.           100%
                    He's bound to be there by now.     high

The other degrees of modality (and occasionally the 100% and high degrees, as well) are often quite idiosyncratic, and may require adverbs, normal verbs, and unusual language-specific forms of prosody and/or periphrasis.

Some linguists consider certain feelings about an event to be modal in nature. Here are some examples: fear: I fear that he left. gladness: I'm glad that our team won.
sorrow: It's sad that he flunked the course. curiosity: It's curious that he left so early.

However, these are not true modals because the embedded event CAUSES the state of the speaker. For a true modal, the speaker is judging a situation and must be the source of the judgement. Besides, these feelings are inherently mental, and represent the state of the speaker himself. They do NOT represent the speaker's judgement of an event. Thus, they should be derived from basic state verbs. [Note though, that while the English word "curious" prototypically represents a mental state, the related word "odd" is a true modal.]

Some people may also be tempted to include other attitudes, such as disgust, fondness, hatred, suspicion, etc. among the modals. However, these again do not indicate the speaker's judgement about what he is saying. In fact, they represent the speaker's feelings towards the listener or a third party.

Since all modalities express the speaker's judgement towards what he is saying, all modalities are, in effect, a kind of speech act, and it should not be surprising that modalities that do not have formal expression in a particular language are often implemented using speech act verbs (e.g. the English hortative "to urge"). In fact, all true modals can be paraphrased as something like "I say that there is X degree of modality Y that Z". For example, the sentence "You need to find a job" can be paraphrased as "I say that there is a high degree of necessity that you find a job". And like all speech acts, the 'agent' (i.e. the speaker) attempts to cause a change of state in the 'patient' (i.e. the listener), either by affecting the behavior of the patient or by imparting information to the patient. In other words, the speech act either tries to convince the listener to do or to not do something, or it tries to get the listener to accept, question, reject, or supply information.
It's important to keep this in mind if you should ever feel that other concepts may be inherently modal in nature. [Later, we'll discuss a rigorous and comprehensive test for modal concepts.]

18.2 THE SEMANTICS OF MODALITY

All modalities belong to one of two categories:

    1. Epistemic: a judgement about a real situation (e.g. "John may have gone away.")
    2. Deontic: a judgement about a potential or hypothetical situation (e.g. "John should go away.")

In addition, there is a special type of deontic modality that includes an imperative from the speaker that the hypothetical situation should or should not be brought about. We will refer to this special modality as "speaker-oriented". The best example is a simple command, such as "Go away!".

As we saw above with epistemic probability, each modal concept can take on a range of values. Here are complete examples for epistemic probability, deontic obligation, and speaker-oriented obligation:

    Epistemic probability:
        100%:           John left.
        high:           John must have left.
        low:            John might have left.
        very low:       John just might have left.
        0%:             John did not leave.
        undefined:      John may have left.
        interrogative:  Did John leave?

    Deontic obligation:
        100%:           John must leave.
        high:           John should leave.
        low:            John should not leave.
        very low:       John really shouldn't leave.
        0%:             John must not leave.
        undefined:      John can/may leave.
        interrogative:  Should John leave?

    Speaker-oriented obligation:
        100%:           Leave! or You must leave!
        high:           I'd leave if I were you.
        low:            I wouldn't leave if I were you.
        0%:             Don't leave! or You must not leave!
        undefined:      You can/may leave.

The 100% deontic obligation modality is sometimes called the _obligative_, and the 100% speaker-oriented obligation modality is called either the _imperative_ or the _hortative_. The 0% speaker-oriented modality is often called the _prohibitive_. The undefined deontic and speaker-oriented modals are closest to what linguists call the _permissive_, since they provide an option.
The 100% and high versions of deontic modalities imply that the hypothetical event can, should, or will occur. The low and 0% versions imply that the event can not, should not, or will not occur. Thus, 0% speaker-oriented obligation modalities are equivalent to forbidding something or demanding that something NOT be done. This is a very important distinction and should always be kept in mind. The degree applied to an epistemic modality is the degree of the modality itself. The degree applied to a deontic modality indicates the degree to which the hypothetical event is eventually realized; i.e., the degree of EPISTEMIC PROBABILITY that can, should, or will apply to the hypothetical event. We'll see many examples of this later. Using the same logic, the undefined deontic is used to indicate that change is optional, and the interrogative deontic is used to ask if change is desirable. There are several other modalities. Here's another epistemic one: Epistemic evidentiality: 100%: It's obvious/clear/evident that John left. high: John seems to have left. low: There's little reason to believe that John left. or John couldn't have left. very low: There's almost no reason to believe that John left. 0%: There's no reason to believe that John left. or John couldn't possibly have left. undefined: There may or may not be reason to believe that John left. interrogative: Is there reason to believe that John left? or Could John have left? Thus, evidentiality indicates what APPEARS to be true - not what actually IS true. In effect, it simply comments on how reliable the speaker feels the information is. Some languages provide even greater detail, such as whether the speaker saw the event with his own eyes or heard it with his own ears. However, these more specific modalities are relatively rare. Here's another example of a deontic modality: Deontic inevitability: 100%: He can't help being there by now. or He WILL be there by now. high: He's bound to be there by now. 
low: It's hard to imagine him being there by now. very low: I just can't imagine him being there by now. 0%: He can't possibly be there by now. undefined: He could be there by now, but who knows? (The implication is that he's unpredictable.) interrogative: Is it possible that he's already there? (Here, "possible" has a sense closer to "predictable".)

100% inevitability is unlike 100% obligation, since it implies that something WILL happen in spite of any attempts to stop it.

Here are a few other modalities:

    Epistemic adequacy/sufficiency:
        What he's doing is adequate/sufficient/satisfactory.

    Deontic necessity:
        He needs to take care of them.

    Epistemic significance:
        Does it matter that John won?
        Yes, it matters.
        It's significant that he left early.

    Deontic importance:
        He'd better keep his commitment.
        It's important for him to keep his commitment.
        He'd better not leave early.
        [This modality implies that a situation will have important consequences. High degrees imply positive consequences, while low degrees imply negative consequences.]

Note that only deontic obligation has a speaker-oriented version. Later, we'll discuss other potential modalities. We'll also discuss how to test new concepts to determine if they are inherently modal in nature.

18.3 IMPLEMENTING MODALITY

So, how should we implement modality in a way that captures its inherent regularity, while avoiding the ubiquitous variability and idiosyncrasy of natural languages? There are three characteristics of modality that we need to represent:

    1. The modal concept (e.g. probability, evidentiality, etc.)
    2. The degree of modality (e.g. 100%, high, interrogative, etc.)
    3. The type of modality (i.e. epistemic, deontic, or speaker-oriented)

However, there is no need to indicate whether the type of a modal is epistemic or deontic, because the type is an inherent part of the modal concept. [If there were some way to derive one from the other, then we WOULD want to mark the type.
However, I have not been able to do this, even though I spent a considerable amount of time trying. For example, what is the deontic counterpart of epistemic necessity? What is the epistemic counterpart of deontic obligation? Although I did once think that there was a correlation, I was never able to state the correlation with semantic precision, and so I abandoned the idea. I'll have more to say about this later.]

So, in order to implement modality in the sample language, I will allocate a set of MNEMONICALLY compositional roots, just as I did for deictics and tense/aspect roots. In other words, modal words will be formed from true, unique root morphemes, but we will design them in a way that will display their inherent compositionality. Here are the details:

    Degree                  Modality
    -------------------------------------------------------------------
    100%           pi-      Probability (epistemic)      -nte-
    High           ge-      Evidentiality (epistemic)    -sna-
    Low            so-      Adequacy (epistemic)         -ngo-
    Very low       ju-      Significance (epistemic)     -mbe-
    0%             na-      Obligation (deontic)         -ndu-
    Undefined      xe-      Inevitability (deontic)      -sko-
    Interrogative  ku-      Necessity (deontic)          -tsi-
                            Importance (deontic)         -spu-

                            Speaker-oriented obligation: -nka-

Note that only obligation has speaker-oriented morphemes. They will be used to derive all basic imperatives.

Note also that the degree markers are intentionally chosen for their mnemonic value, since they are identical to the scalar polarity morphemes "-pi-", "-ge-", and so on. However, keep in mind that the modal degree markers are not true morphemes, but are only PART of complete morphemes. For example, the "pi" in "pinte" is not a true morpheme - it is simply the first syllable of the morpheme "pinte".

Finally, since a modal takes an entire clause as an argument, we will use the terminator "-fo" to indicate the part-of-speech, just as we did for sentential register and tense/aspect words.
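The way the table combines a degree syllable with a modality ending and the terminator "-fo" can be sketched in a few lines of Python. This is only an illustration and not part of the language design: the names `DEGREES`, `MODALITIES`, `modal_word`, and `parse_modal` are hypothetical, and the code splits each root into two strings purely as a mnemonic lookup, even though a complete modal root such as "pinte" is a single morpheme.

```python
# Mnemonic degree syllables, copied from the table above.
DEGREES = {
    "100%": "pi",
    "high": "ge",
    "low": "so",
    "very low": "ju",
    "0%": "na",
    "undefined": "xe",
    "interrogative": "ku",
}

# Each modal concept maps to (ending, type of modality).
MODALITIES = {
    "probability": ("nte", "epistemic"),
    "evidentiality": ("sna", "epistemic"),
    "adequacy": ("ngo", "epistemic"),
    "significance": ("mbe", "epistemic"),
    "obligation": ("ndu", "deontic"),
    "inevitability": ("sko", "deontic"),
    "necessity": ("tsi", "deontic"),
    "importance": ("spu", "deontic"),
    "speaker-oriented obligation": ("nka", "speaker-oriented"),
}

TERMINATOR = "fo"  # part-of-speech terminator for clause-level modals


def modal_root(degree, modality):
    """Return the modal root, e.g. ('100%', 'probability') -> 'pinte'."""
    ending, _kind = MODALITIES[modality]
    return DEGREES[degree] + ending


def modal_word(degree, modality):
    """Return the full modal word, i.e. the root plus the terminator."""
    return modal_root(degree, modality) + TERMINATOR


def parse_modal(word):
    """Recover (degree, modality, type) from a modal word ending in 'fo'."""
    lowered = word.lower()
    if not lowered.endswith(TERMINATOR):
        raise ValueError(f"{word!r} does not end in -{TERMINATOR}")
    root = lowered[: -len(TERMINATOR)]
    for degree, syllable in DEGREES.items():
        for modality, (ending, kind) in MODALITIES.items():
            if root == syllable + ending:
                return degree, modality, kind
    raise ValueError(f"{word!r} is not a recognized modal word")


if __name__ == "__main__":
    print(modal_word("high", "necessity"))
    print(parse_modal("Kuntefo"))
```

Because the type (epistemic, deontic, or speaker-oriented) is stored with the modal concept rather than marked separately, the lookup mirrors the point made above that the type is an inherent part of the concept.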
Now, in all natural languages that I am familiar with, the indicative is the default and is unmarked. Thus, it might seem that the 100% epistemic probability marker "pintefo" is not really needed. However, a language must have a way of emphasizing the truth of a statement, and "pintefo" is the obvious and natural choice for this function. (Cf. "He went to the house" vs. "He DID go to the house" or "He definitely went to the house".) Here are some examples using English word order: Louise DID buy it. = Louise pintefo buy it. Louise didn't buy it. = Louise nantefo buy it. Did Louise buy it? = Kuntefo Louise buy it? Louise must have bought it. = Louise gentefo buy it. Louise may have bought it. = Louise xentefo buy it. Louise just may have bought it. = Louise sontefo buy it. She has to leave now. = She pindufo leave now. She should leave now. = She gendufo leave now. He needs to study harder. = He getsifo study harder. He's bound to cause trouble. = He geskofo cause trouble. He seems to have left. = He gesnafo leave. He'd better leave now. = He pispufo leave now. Now, let's see what happens when we apply two or more modals to the same sentence. Consider the following: Does she need to leave now? ? Kutsifo she leave now? (kutsifo = interrogative necessity, ? Kuntefo getsifo she leave now? kuntefo = interrogative probability, ? Getsifo kuntefo she leave now? getsifo = high necessity) The third example doesn't really make any sense ("It is necessary that 'is it true that she is leaving now?'"). The second example seems to have the desired meaning ("Is it true that she needs to leave now?"). But how do we interpret the first example? We might be tempted to paraphrase an interrogative degree of modality as "What is the degree of modality Y?". However, this is incorrect. When we ask a question, we are not asking for a probability - we are asking for a response that CONTAINS a probability. 
For example, if I ask "Is John home?", I am hoping for a statement such as "No, he isn't". I would be very surprised with an answer such as "0% probability". In other words, a typical valid answer would be one of the NON-interrogative probabilities.

The same reasoning applies to the other modalities. When we use a modality of interrogative degree, we are asking a question that can best be answered by an answer that contains a non-interrogative modality of the same type. Thus, when I ask a question using "kuntefo", I am hoping for an answer that contains either "pintefo", "gentefo", "sontefo", "nantefo", or "xentefo".

Now, let's look again at the first and second examples:

    Does she need to leave now?

    ? Kutsifo she leave now?             (kutsifo = interrogative necessity,
    ? Kuntefo getsifo she leave now?      kuntefo = interrogative probability,
                                          getsifo = high necessity)

The second example is actually asking "Does she have a high need to leave now?". This is not quite the same as asking "Does she need to leave now?" A good paraphrase would be "Does she REALLY need to leave now?". Thus, a possible answer to the second example could be something like "No, she doesn't NEED to leave, but she probably should anyway".

The first example, however, is completely general, and is thus closest in meaning to the English sentence "Does she need to leave now?". A good paraphrase of the first example would be "Is there a need for her to leave now?".

In other words, a question using an interrogative modal is actually more general than combining the interrogative probability marker "kuntefo" with a specific marker of the desired modality. When "kuntefo" is followed by another modal, we are really asking if that SPECIFIC degree of modality applies. It is important to keep this in mind when designing and describing the modality system of an AL.

18.4 MORE ON SPEAKER-ORIENTED OBLIGATION

Speaker-oriented obligation is like deontic obligation in that it indicates that something should be done.
However, it goes beyond deontic obligation by actually urging the listener to do something. Thus, a good English paraphrase of 100% speaker-oriented obligation would be something like "See to it that..." or "Make sure that...". Here's a complete list:

    100%           pinkafo   See to it that...
                             Make sure that...
    high           genkafo   You should see to it that...
                             You should make sure that...
    low            sonkafo   You should not let...
    very low       junkafo   You really shouldn't let...
    0%             nankafo   Don't let...
    undefined      xenkafo   You might want to have ...
    interrogative  kunkafo   Should you have... ?

Note that the above interpretations allow a complete sentence as an argument, and can thus demand action from a third party. For example, the sentence:

    "Pinkafo Billy go to bed now"    <- 100% speaker-oriented obligation

could mean any of the following:

    See to it that Billy goes to bed now.
    Have Billy go to bed now.
    Make sure that Billy goes to bed now.

In addition to the above, all languages seem to have a much more narrowly defined 100% imperative which is very abrupt and which applies directly and only to the listener. For example, the English command "Open the door!" is a more abrupt version of "See to it that you open the door".

Since this more abrupt imperative is both useful and (apparently) universal, we will also implement it in the sample language. Specifically, the special part-of-speech terminator "-cu" will be a short form for the normal verb terminator "-si" plus the 2nd person deictic "dustuda" meaning 'you' as the implied subject. The following examples show how "-cu" is used:

    Pinkafo dustuda teyokosi her to swim  =  See to it that you teach her to swim.
    Teyokocu her to swim!                 =  Teach her to swim!

Note that a true imperative is always directed at the listener, even if the speaker is demanding action by a third party. It is the listener that is being given responsibility for the action.
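The relationship between the abrupt "-cu" imperative and its long form can be sketched as a simple string rewrite. The Python below is only an illustration under stated assumptions: the function name `expand_imperative` is hypothetical, the sentence is assumed to use English word order with the "-cu" verb as its first word, and the long form is assumed to be "pinkafo" + "dustuda" + the same verb with "-si", as in the example pair above.

```python
SHORT_IMPERATIVE = "cu"   # abrupt 2nd person imperative terminator
VERB_TERMINATOR = "si"    # normal verb terminator
LONG_FORM_PREFIX = ["pinkafo", "dustuda"]  # 100% speaker-oriented modal + 'you'


def expand_imperative(sentence):
    """Rewrite a '-cu' imperative as its 'Pinkafo dustuda ...-si' long form."""
    words = sentence.rstrip("!").split()
    if not words:
        raise ValueError("empty sentence")
    verb = words[0].lower()
    if len(verb) <= len(SHORT_IMPERATIVE) or not verb.endswith(SHORT_IMPERATIVE):
        raise ValueError(f"{words[0]!r} is not a -cu imperative")
    # Swap the terminator: the implied subject 'you' becomes explicit.
    long_verb = verb[: -len(SHORT_IMPERATIVE)] + VERB_TERMINATOR
    text = " ".join(LONG_FORM_PREFIX + [long_verb] + words[1:])
    return text[0].upper() + text[1:]


if __name__ == "__main__":
    print(expand_imperative("Teyokocu her to swim!"))
```

The rewrite only goes in one direction. A long form such as "Pinkafo Billy go to bed now" has no "-cu" counterpart, since the short form always has the listener as its implied subject.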
Finally, some languages that have a distinct morphology for imperatives can also apply them directly to first and third persons. The ones that I am familiar with generally have the meaning 'let ...' as in "Let them leave if they really want to". However, these are not true imperatives. They are either permissives (i.e. undefined deontic obligation "xendufo"), verbs meaning 'permit/allow' (derived in the next section), or disjuncts expressing a sense of frustration or resignation. (We discussed disjuncts earlier in the section on grammatical voice, and will have even more to say about them later in the section on conjunctions.) 18.5 FURTHER DERIVATION USING MODAL ROOTS As was the case with tense/aspect roots, modal roots can be used to derive many additional words. Before we can start, though, we need to define the equivalent 'state' of a modal. In other words, what is the basic or raw state that is associated with a modal? As I mentioned earlier, all of the modal derivations are essentially speech acts, since the speaker tries to induce a change of state in the listener using speech. However, unlike a true speech act, a modal ALWAYS describes the speaker's judgement of a situation that may be completely unrelated to the speech act itself. In other words, a modal is a combination of a speech act, an additional situation, plus the speaker's judgement of the additional situation. Thus, a modal concept is much more complex than most basic states, and this overly complex concept will not be useful if it undergoes further derivation. Fortunately, the most useful component of a modality is how the speaker judges the situation. If we can isolate this attitude, it will provide us with a simpler concept that we can then use very productively in further derivations. Thus, we need a strategy that will eliminate the speaker's contribution such that only the basic modality remains. 
To that end, I suggest that we paraphrase the modal in such a way that it eliminates the 'speech act' component and isolates the modal concept. We can do this by using each modal in a test sentence and paraphrasing it in the form "it is X that ...", where X is the modal concept. Here are several examples: Epistemic probability: 100%: He took care of the children. It is true that he took care of them. high: He must have taken care of them. It is probable/likely that he took care of them. low: He just may have taken care of them. It is unlikely/improbable that he took care of them. very low: He almost certainly did not take care of them. It's implausible/hardly possible that he took care of them. 0% He did not take care of them. It is false/impossible that he took care of them. undefined: He may have taken care of them. It is possible that he took care of them. Deontic obligation: 100%: He must take care of them. It is mandatory/obligatory that he take care of them. high: He should take care of them. It is desirable that he take care of them. low: He should not take care of them. It is undesirable that he take care of them. 0%: He must not take care of them. It is forbidden/prohibited that he take care of them. undefined: He can/may take care of them. It is optional that he take care of them. Epistemic evidentiality: 100%: It's obvious/evident/apparent that he took care of them. (same) high: It seems/appears that he took care of them. It is highly apparent that he took care of them. Deontic inevitability: 100%: He can't help taking care of them. It is inevitable/fated/preordained that he take care of them. High: He's bound to take care of them. It is 'meant to be' that he take care of them. Epistemic sufficiency: 100%: It's sufficient/adequate that he took care of them. (same) 0%: It's insufficient/inadequate that he took care of them. (same) Deontic necessity: 100%: It's essential that he take care of them. (same) high: He needs to take care of them. 
It is necessary that he take care of them. Epistemic significance: 100%: It's highly significant/momentous that he left early. (same) high: It matters that he left early. It is significant that he left early. Deontic importance: 100%: He'd better leave early. It is essential that he leave early. high: It is essential/important that he leave early. (same) [Incidentally, I am very reluctant to use passive glosses like "forbidden", "pre-ordained", "desirable", and so on, since they implicitly involve agents or patients and are not true abstract concepts. Unfortunately, I have not been able to think of better glosses, and English may not HAVE any. When an appropriate word is not available, natural languages often extend the meaning of existing words using polysemy or metaphor.] Note that none of the above are true states. If they were, they would describe the states of ENTITIES. Instead, they describe the mental judgement of the speaker about a SITUATION. Thus, the situation is actually the focus of the speaker's mental state. For example, if a situation is 'obvious', then the speaker FEELS that it is obvious. If a situation is 'adequate' then the speaker FEELS that it is adequate. And so on. In other words, the true states can be best captured in the form of P/F-s verbs, since they indicate a relationship between a patient/entity and a focus/situation. The raw concepts themselves (i.e. 'true', 'obvious', 'adequate', etc.) can be represented by the F-s [-P] middle voice forms (CCM "-de-"). Note also that all of the epistemic paraphrases use the past tense, while all of the deontic paraphrases use an implicit future tense. I discovered that using this convention is less likely to result in confusion (at least when the paraphrases are in English). Now, in order to convert the modal concept to a state, we must add a patient. And since modal concepts represent judgements, opinions, and attitudes, modal states will always be mental states. 
Thus, we can best capture the mental state by using a P/F-s state verb of the form "Patient feels/thinks that focus". Here are some epistemic examples: probability: I feel that F is true = I believe F I feel that F is unlikely/improbable = I doubt F evidentiality: I feel that F is obvious = I am sure/positive that F sufficiency: I feel that F is adequate = I am satisfied with/that F Note that each modal concept has now become an actual mental state of a patient. Now, let's see if we can do the same thing with a few deontic modalities: obligation: I feel that something is mandatory = I ??? necessity: I feel that something is necessary = I ??? inevitability: I feel that something is inevitable = I ??? What's wrong? It seems that deontic modalities do not really describe the state of the speaker. Instead, the speaker is actually describing the state of someone or something else. Thus, it's necessary to paraphrase deontic derivations in terms of the other entity, as follows: obligation: Something is mandatory to the patient = The patient is obligated to... necessity: Something is necessary to the patient = The patient needs/has a need for... inevitability: Something is inevitable for the patient = The patient is fated/destined to... Thus, for epistemic modalities, we must paraphrase the state as the equivalent state of the speaker. For deontic modalities, we must paraphrase the state as the equivalent state of the entity that the speaker is talking about. Also, it's essential to keep in mind that the degree of an epistemic modality is the degree of the modality itself. However, the degree of a deontic modality indicates the likelihood of the hypothetical event. For example, 0% probability means that the probability is 0%. However, 0% obligation does NOT mean that there is no obligation. Instead, it means that the event must not occur; i.e., that there is an obligation to see that the event does not occur. Thus, 0% obligation is equivalent to prohibition. 
If we need to specify that there is a deontic option, we use the undefined degree.

Finally, speaker-oriented obligation is simply an imperative version of deontic obligation. Thus, it can be used to implement verbs that are inherently imperative (or prohibitive) in nature. I'll provide examples of these below.

Now, with all of the above in mind, let's create several useful words from modal roots. For all modal roots, the default class will be P/F-s. Here are some of the many possible derivations from the epistemic probability modality:

    100% epistemic probability, "-pinte-":

    P/F-s    pintesi = to believe, to accept as true/correct/right
             pintenuda = belief, something believed to be true ("-nu-" = passive CCM)
             pintepada = faith ("-pa-" = process CCM)
             pintedesi = to be true/correct/right ("-de-" = middle CCM)
             pintedeno = true/correct/right
             pintededa = the truth, what is true
             Pintevoisi = yes, that's correct/right, it's true, etc. (used in answer to a question, "-voi-" = double middle CCM)

             [English speakers should be careful not to extend the meaning of these derivations to people. For example, in the sentence "You are correct", the speaker really means 'What you are saying is correct'. Thus, it's acceptable to say "THAT is correct", but not "YOU are correct". In other words, 'truth' or 'correctness' applies to a situation - NOT to a person.]

    P/F-d    pintedosi = to swallow, to fall for
    P-s      pinteseno = credulous
    AP/F-s   pintefisi = to presume/assume, to be certain/sure that
             pintefinesi = to agree that F ("-ne-" = cosubject CCM)
             pintefisi F nepe X = to agree with X that/about F
    AP/F-d   pintesuasi = to decide, to conclude
    A/P/F-p  pinteniosi = to impart, to convey, to make known, to disclose
    A/P/F-d  pintekosi = to convince, to show, to demonstrate, to make clear

    [Note that the AP/F version implies an intentional belief; i.e. that the believer "takes" something to be true rather than "thinks/feels" something to be true.]
    high epistemic probability, "-gente-":

    P/F-s    gentesi = to suspect, to feel that, to accept as likely/probable, to be of the opinion that, to think that
             gentedeno = likely/probable
             gentemanteda = opinion/feeling ("-mante-" = process result or product CCM)
             gentevoisi = probably, in all likelihood (in answer to a question)
    AP/F-s   gentefisi = to guess, to hypothesize

    low epistemic probability, "-sonte-":

    P/F-s    sontesi = to be doubtful, to accept as unlikely/improbable
             sonteno = doubting/doubtful (i.e. a person)
             sontedeno = unlikely/improbable
             sontevoisi = probably not, not likely (in answer to a question)
    AP/F-s   sontefisi = to be skeptical about, to consider unlikely/improbable
             sontefino = skeptical
             sontefideno = dubious/doubtful (i.e. a fact)

    0% epistemic probability, "-nante-":

    P/F-s    nantesi = to accept as false
             nantedeno = false/incorrect/wrong
             nantededa = mistake, something wrong or incorrect
             Nantevoisi = no, that's wrong/incorrect, it's not true (used in answer to a question)
    AP/F-s   nantefisi = to disbelieve, to reject
             nantesausi = to disagree that F ("-sau-" = non-subject CCM)
             nantefisi F saupe X = to disagree with X that F

    undefined epistemic probability, "-xente-":

    P/F-s    xentesi = to daresay, to accept as possible
             xentedeno = possible
             xenteveda = possibility ("-ve-" = quality CCM)
             xentevoisi = maybe/perhaps (in answer to a question)
    "0"      xentelape = maybe/perhaps/possibly (adverb)

    interrogative epistemic probability, "-kunte-":

    P/F-s    kuntesi = to be unsure about
             kunteno = uncertain, unsure (e.g. people uncertain of themselves)
             kuntedeno = uncertain, unsure (e.g. uncertain outcome)
             Kuntevoisi? = Is that right/correct/true/so?
    AP/F-s   kuntefisi = to wonder, to question oneself about
    A/P/F-d  kuntekosi = to raise doubts in (someone) about
    A/P/F-s  kuntetuesi = to confuse (someone) about
    A/P-d    kuntepusi = to confound, to befuddle

Someone once said that all truth is relative.
The above derivations certainly seem to reflect this attitude, since they imply that the truth of a situation is more perceived than real; i.e. it is true only if it is true to a patient. However, keep in mind that the MIDDLE voice implies nothing about the nature of the unmentionable perceiver. It could just as well be the universe, your cat, or a supreme being. In spite of this, it is important to remember that 'truth' as derived above does NOT mean 'absolute truth' or 'reality'. Thus, we cannot use the modal to derive concepts such as 'to exist = to be real' or 'to create = to make real'. The modals do not imply reality - only the perception of reality. Another difference is that the 'truth' described here is inherently scalar, while the concept we derived earlier meaning 'real' (state root = "-veya-") is inherently binary. Although I listed a large number of useful derivations for the epistemic probability modality, I'm sure that there are many more. The modal concepts are so basic, that it shouldn't be surprising that they can be the source of so many useful words. However, for the sake of brevity, I will only list a few derivations for the remaining modalities: 100% deontic obligation (= obligation), "-pindu-": P/F-s pindusi = it is mandatory/compulsory/obligatory for P to F pindudeno = mandatory, compulsory, obligatory pindudeda = duty, obligation AP/F-s pindufisi = to feel obliged/obligated to... AP/F-d pindusuasi = to take on the obligation to... 
        A/P/F-s   pindutuesi = to obligate, to give an obligation to
                     someone else, to require
        A/P/F-p   pinduniosi = to charge (with an obligation)
        A/P/F-d   pindukosi = to have (as in "I had John wash the
                     dishes")

    high deontic obligation, "-gendu-":

        P/F-s     gendusi = it is desirable for P to F
                  gendudeno = desirable

    low deontic obligation, "-sondu-":

        P/F-s     sondusi = it is undesirable for P to F
                  sondudeno = undesirable

    0% deontic obligation, "-nandu-":

        P/F-s     nandusi = it is impossible/proscribed for P to F
                  nandudeno = impossible/proscribed
        A/P/F-s   nandutuesi = to prohibit

    undefined deontic obligation, "-xendu-":

        P/F-s     xendusi = it is optional for P to F
                  xendudeno = optional
                  xenduvusi = to be optional for

                  [The inverse form "xenduvusi" would be used for
                  English sentences such as "Picking up the litter is
                  optional for the guests".]

        A/P/F-s   xendutuesi = to allow, to permit

    interrogative deontic obligation, "-kundu-":

        [This modality will indicate that the patient is uncertain of
        the desirability of something.]

        P/F-s     kundusi = to be unsure if F should be done, to be
                     unsure if F is desirable (e.g. I kundusi John buy a
                     new car = I'm not sure if John should buy a new
                     car.)

    100% speaker-oriented obligation (= hortativity), "-pinka-":

        [Note that hortativity is a true speaker-oriented modality. Thus, we can derive a true, imperative verb from it.]
A/P/F-p pinkaniosi = to demand that, to insist that, to order, to command pinkanioxemnasi = to beg ("-xemna-" is the groveling register MCM) pinkaniotenkosi = to beseech/implore/entreat ("-tenko-" is the humble register MCM) The non-agentive derivations are also useful: P/F-s pinkasi = it is imperative for P to F pinkadeno = imperative high speaker-oriented obligation, "-genka-": A/P/F-p genkaniosi = to encourage/urge low speaker-oriented obligation, "-sonka-": A/P/F-p sonkaniosi = to discourage 0% speaker-oriented obligation (= prohibitive), "-nanka-": A/P/F-p nankaniosi = to forbid, to deny permission undefined speaker-oriented obligation, "-xenka-": A/P/F-p xenkaniosi = to give permission It is also possible to add a 'speech' morpheme to a modal. This morpheme is the one we discussed earlier when we derived the verb meaning 'to apologize'. However, this morpheme simply indicates that speech is being used. It does NOT create a true imperative as in the above example, "sicaipinio", which was formed from a speaker-oriented modal. In the sample language, I will use the MCM "-ja-" to indicate that speech is being used. Here are some examples: A/P/F-p pinteniosi = to impart, to convey, to make known, to disclose pinteniojasi = to claim, to affirm, to assert, to say that something is true/correct/right A/P/F-d pintekosi = to convince, to show, to demonstrate, to make clear pintekojasi = to persuade, to talk into AP/F-s nantefisi = to disbelieve, to reject nantefijasi = to deny, to repudiate Keep in mind that these derivations simply imply that speech is being used - they are NOT imperatives! The speech morpheme "-ja-" will probably not be very useful for many states and actions, since verbs that are normally classified as speech acts are often used in non-speech situations, and our "-p" verbs capture both speech and non- speech senses very nicely. True speech acts (e.g. 
'to screech' and 'to curse') have the speech component as part of their meaning and do not require a separate morpheme. There may be times, however, when it is necessary to use a "-p" verb in order to clearly indicate or emphasize that speech is being used, and so the speech morpheme will have some use for normal state and action verbs (e.g. the verb meaning 'to apologize'). In general, though, this morpheme should only be used to clearly indicate that speech is being used. Finally, the most obvious and generic application of "-ja-" is the basic speech act verb "jasi = janiosi" meaning 'to say/tell/speak', where its default class will be A/P/F-p. It will also be useful if other speech act morphemes start with "ja" as a mnemonic aid. For example, morphemes such as "jambu", "jaya", "jaste", "japliwa", etc. could be used for other speech acts. Now, let's look at a few derivations from other modalities: 100% epistemic sufficiency, "-pingo-": P/F-s pingosi = to be satisfied that, to feel that F is sufficient/adequate pingodeno = sufficient/adequate 0% epistemic sufficiency, "-nango-": P/F-s nangosi = to be dissatisfied that/with, to feel that F is insufficient/inadequate nangodeno = insufficient/inadequate high deontic necessity, "-getsi-": P/F-s getsisi = to need, to require (note that this is a VERB, not a pure modal!) getsino = in need getsibie = in need of, needing getsideno = necessary getsideda = needs (i.e. 
what is needed)
                  getsiveda = necessity/need (the need itself)
                  getsivegiu = the necessity/need for
        P-s       getsiseno = needy
        A/P/F-s   getsituesi = to deprive

    100% epistemic evidentiality, "-pisna-":

        P/F-s     pisnasi = it is evident/obvious to P that F
                  pisnadeno = evident, obvious
        A/P/F-d   pisnakosi = to prove, to verify
                  pisnakovoino = conclusive ("-voi-" is the double
                     middle CCM)

    high epistemic evidentiality, "-gesna-":

        P/F-s     gesnasi = it is apparent to P that F
                  gesnadeno = apparent, seeming
        A/P/F-d   gesnakovoino = circumstantial

    low epistemic evidentiality, "-sosna-":

        A/P/F-d   sosnakovoino = inconclusive

    100% deontic inevitability, "-pisko-":

        P/F-s     piskosi = to feel that F is inevitable/fated/
                     pre-ordained
                  piskodeno = inevitable, unavoidable, pre-ordained

    high deontic inevitability, "-gesko-":

        P/F-s     geskosi = to feel that F is 'meant to be'
                  geskovuasi = "It was meant to be." ("-vua-" makes all
                     arguments of a verb oblique)

Finally, when "-fo" is applied to a modal, it is very similar to a combination of "ma" + "de" + "si", where "-ma-" is the P/F-s verb classifier and "-de-" is the middle CCM. The result is an F-s [-P] verb, which takes an embedded sentence as its only argument. For example, "John pintefo leave" meaning 'John definitely left' is similar to "Pintemadesi John leave" meaning 'It is accepted as true by some unspecified patient that John left'. However, there is an important difference between the two. When "-fo" is used, the resulting word is actually deictic, since the patient MUST be the speaker. When "-madesi" is used, the patient is unspecified and does not necessarily have to be the speaker. Thus, application of "-fo" to modals is completely compatible with its application to register and tense/aspect morphemes. In all such derivations, one or more of the implied arguments is deictic.


18.6 ARE THERE OTHER MODALITIES?

Since modalities represent a speaker's judgement about a situation, and since it's possible for people to pass judgement in many different ways, the obvious question is whether we can implement other concepts as modalities, rather than as basic states. I am convinced that the answer is a resounding "YES", even though I've only discussed those modalities that I've read about or which seemed obvious to me. There is no doubt in my mind that other modalities exist, and that these modalities are likely to have formal representation (as inflections, auxiliaries, particles, etc.) within some natural languages.

When trying to decide whether a concept is inherently modal in nature, we must keep in mind that a modality represents the speaker's judgement of a SITUATION. The concept must NOT represent the speaker's feelings towards the listener or a third party, nor can it represent an attitude that is CAUSED by a situation, the listener, or a third party; i.e. the speaker must be the source of the judgement. Also, the concept must not represent the state or behavior of an actual ENTITY - it must represent a judgement of a SITUATION. Upon further derivation, of course, the concept WILL represent the mental state of an actual entity (e.g. "to believe" from the modal concept 'true').

Modal concepts are inherently abstract. Normal states are not. Consider the following:

        yellow   vs.   odd
        old      vs.   true
        heavy    vs.   necessary

In effect, a modality is not an INHERENT quality of a situation. It cannot be objectively measured. Instead, it is externally imposed. Another feature that modal concepts all seem to have in common is that they, unlike normal states, are inherently challengeable because they are inherently subjective.

Fortunately, it is definitely possible to test a concept to determine if it is inherently modal in nature. In English, we can test a concept M for modality by using one of the following two sentences:

        (1) It's M that he left early.
        (2) It's M that he leave early.

If (1) makes sense, then M is an epistemic modality. If (2) makes sense, then M is a deontic modality. However, there are a small number of mental states in English that will pass test (1) even though they are not true modalities. Examples are "sad" and "curious". In order to detect these "false modals", we can apply a third test, as follows:

        (3) Everyone is M that he left early.

If (3) makes sense, then M is NOT a modality, even if it passes test (1).

Now, here is a list of other concepts that I believe are inherently modal in nature. Each of them passes the above tests:

        I say/believe that something is:

        100% concept               0% concept
        --------------             --------------
        interesting                boring
        welcome                    unwelcome/imposition
        fortunate/beneficial       unfortunate/detrimental
        normal                     odd/strange/abnormal
        correct/right              incorrect/wrong
        good                       bad
        right/just/proper/         wrong/unjust/improper
          meet/fit
        acceptable/                unacceptable/
          admissible/                inadmissible/
          okay/                      not okay/
          advisable                  ill-advised
        in good taste              in poor taste

Note that morphemes for the last several examples can also be used as the 'good' and 'bad' quality morphemes that we discussed earlier in the section on augmentatives and diminutives. In fact, ALL of the modality morphemes should be useful in word derivation as modifiers of other roots. Natural languages contain many words that not only represent a referent but which also indicate the speaker's opinion or attitude towards the referent. Thus, modal morphemes can be used in the same way as register morphemes; i.e., as both lexical and sentential modifiers.

In any case, it seems to me that the above concepts (and certainly others I've missed) are indeed modal in nature and should be treated as such.

Another question that we need to answer is this: is it possible that epistemic and deontic concepts come in pairs? In other words, does each epistemic modal have a deontic counterpart and vice versa? For example, what is the deontic counterpart of epistemic goodness?
What is the epistemic counterpart of deontic inevitability? And so on.

Originally, I thought that they did come in pairs. Here are some examples of the pairings as I conceived them:

        Epistemic                Deontic
        ---------                -------
        true                     mandatory/obligatory
        adequate/sufficient      necessary
        obvious                  inevitable
        significant              important
        good                     desirable

However, while there does seem to be some kind of relationship between the listed counterparts, I was never able to create precise semantic definitions of one in terms of the other. The only definitions I could come up with were ambiguous and unsatisfying.

Finally, it is important not to confuse modal concepts with concepts which describe the inherent, non-subjective characteristics of processes or their results, such as 'easy', 'practical', 'legal', 'efficient', 'successful', and so on. For example, the 'advisability' of a process is inherently subjective, but its 'practicality' can be determined objectively. Thus, advisability is a modality while practicality is not. Fortunately, rigorous application of the above tests for modality will prevent any mistakes.

[Incidentally, I suggest that concepts such as 'practical', 'efficient', 'easy', and so on should NOT be assigned unique morphemes, at least not initially. These concepts are not very basic and I suspect that it will be possible to derive them from more mundane concepts. For example, is the concept 'practical' essentially synonymous with 'doable'? If so, then the word meaning both 'doable' and 'practical' is "zefinuveno", which we derived earlier, and the word meaning 'impractical' is "zefinuvenano". Note that the essential quality/ability CCM "-ve-" is likely to be used in all of these derivations.]


18.7 NON-CLAUSAL MODALITY

So far, we've only discussed modals that modify an entire clause. However, it makes very good semantic sense to extend their use by allowing them to modify specific elements within a clause.
To do this, the syntax of your AL will have to specify how the position of a modal within a sentence determines its scope. For example, if your AL is purely right-branching (i.e. VSO), then a clausal modal would appear before the verb, and a modal which modified a particular element in the sentence would immediately follow it. In our sample language, an obvious choice is to use "-di", the terminator for previous-word modifiers. Here are some examples using English word order and the interrogative modal morpheme "-kunte-": Kuntefo Billy hit Jimmy = Did Billy hit Jimmy? Billy hit kuntedi Jimmy = Did Billy HIT Jimmy? (or did he just yell at him?) Billy kuntedi hit Jimmy = Was it Billy that hit Jimmy? Billy hit Jimmy kuntedi = Was it Jimmy that Billy hit? Billy broke the window with a hammer kuntedi = Was it a hammer that Billy broke the window with? Other modals can also be used in this way. Here are a few examples: Louise nantefo go to the store = Louise didn't go to the store. Louise nantedi go to the store = It wasn't Louise who went to the store. OR = Louise is not the one who went to the store. Louise go to a store nantedi = It wasn't a STORE that Louise went to. Louise go to that nantedi store = It wasn't THAT store that Louise went to. OR = THAT store is not the one that Louise went to. Louise put the book under nantedi the bed (but on it) = Louise put the book not UNDER the bed (but ON it). Louise pintedi go to the store = It was Louise who went to the store. OR = Louise is the one who went to the store. Louise xentefo bought a lamp = Louise may have bought a lamp. Louise bought a lamp xentedi = It may be a lamp that Louise bought. OR = A lamp is what Louise may have bought. OR = What Louise may have bought is a lamp. Louise pindufo leave tomorrow = Louise has to leave tomorrow. Louise leave tomorrow pindudi = It's TOMORROW that Louise has to leave. And so on. 
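The scope rule used in these examples is simple enough to sketch in code. The following Python fragment is only an illustration, under the assumption that a sentence is a string of space-separated words, that a modal ending in "-fo" applies to the whole clause wherever it appears, and that a modal ending in "-di" applies to the word immediately before it. The function name, the root table, and the rough English glosses are my own labels for this sketch and are not part of the sample language.

```python
# Modal roots that appear in the examples above, with rough glosses.
# The glosses are informal labels chosen for this sketch only.
MODAL_ROOTS = {
    "kunte": "interrogative",
    "nante": "negated",
    "pinte": "affirmed",
    "xente": "possible",
    "pindu": "obligatory",
}


def modal_scopes(sentence):
    """Return a list of (gloss, scope) pairs, one per modal word found.

    A "-fo" modal scopes over the whole clause; a "-di" modal scopes
    over the nearest preceding non-modal word.
    """
    results = []
    previous = None  # last non-modal word seen so far
    for word in sentence.split():
        lower = word.lower()
        root, ending = lower[:-2], lower[-2:]
        if root in MODAL_ROOTS and ending == "fo":
            results.append((MODAL_ROOTS[root], "whole clause"))
        elif root in MODAL_ROOTS and ending == "di":
            # "-di" is the terminator for previous-word modifiers
            results.append((MODAL_ROOTS[root], previous))
        else:
            previous = word
    return results


for example in (
    "Kuntefo Billy hit Jimmy",
    "Billy hit kuntedi Jimmy",
    "Louise go to that nantedi store",
    "Louise leave tomorrow pindudi",
):
    print(example, "->", modal_scopes(example))
```

A real parser would of course need the full morphology of the language rather than a five-entry table, but the sketch shows that the position of a "-di" modal is all that is needed to recover its scope.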
When a single element of a sentence is modified by a modal as in the above examples, linguists refer to the process as _clefting_. However, this terminology is a result of analyzing English syntax, where a single sentence is split or 'cleft' into two clauses in order to achieve both emphasis and contrast. When a modal is used as we did above, the term _clefting_ can no longer apply since we are not actually splitting the sentence. Instead, we can refer to it simply as a form of emphasis that contains a strong element of contrast; i.e. 'contrastive emphasis'. Later, when I discuss TOPICALIZATION, we'll see how to deal with other forms of emphasis. 18.8 HEDGES There are times when we need to modify the modality of an utterance, implying that the situation is true/false/etc in spite of reasons to believe otherwise. Linguists refer to this process as "hedging". Here are some English examples: a. STRICTLY SPEAKING, his answer was correct. b. LOOSELY SPEAKING, a dolphin is a fish. c. TECHNICALLY, a penguin is a bird. d. Bill joined the SO-CALLED "Society for Universal Tolerance". In each case, the capitalized expression either affirms or denies the truth of a sentence or the accuracy of a label while implying that there is reason to think otherwise. Thus, example (a) can be paraphrased as "His answer was correct even though it is not what we expected, or even though there are reasons to feel that it was really incorrect". Example (b) indicates that a dolphin is NOT a fish, even though there are reasons to think otherwise. Example (c) is similar to (a) but implies that there is actual data or proof to support the claim. And example (d) implies that there may be reasons to believe that the name of the society is invalid. Thus, all hedges have two things in common: 1. they affirm or deny the modality of an expression; and 2. they add a semantic element with the meaning 'even though there are reasons to believe otherwise'. 
In the sample language, all degrees of modality are available separately, so there is no need to implement (1) - we simply need to use the correct degree. In order to implement (2), we can create a root/MCM that can be used as is or can be added to any word giving it the implication 'in spite of reasons to think otherwise'. In the sample language, we will use the root/MCM "-tomba-" for this purpose (default class = P-s). The most useful hedge words will be the adverb "tombape", and the adjective "tombano": Bill joined the tombano "Society for Universal Tolerance". = Bill joined the so-called/self-styled "Society for Universal Tolerance". The Society for Universal Tolerance NOT is_tolerant tombape. = The Society for Universal Tolerance is tolerant in name only. John behaved very well tombape. = John behaved very well contrary to expectations. OR = John behaved very well in spite of reasons to expect different behavior. OR = Actually, John behaved very well. The most common hedges can be formed by adding "tombape" to a sentence with an appropriate modality, which is actually what we did in the previous example. Here are some more examples: 100% probability: pintefo + tombape = strictly speaking, in truth, actually, indeed Example: John pintefo is a decent person tombape. = John IS actually a decent person. 0% probability: nantefo + tombape = In a sense, loosely speaking Example: A dolphin nantefo is a fish tombape. = In a sense, a dolphin is a fish. (literally: A dolphin is NOT a fish even though there are reasons to think otherwise.) It's also possible to apply the hedge MCM to other words. It will be especially useful with open-class state adverbs. For example, using the state root "-veya-" meaning 'real/existent', we can create the verb "veyatombasi" meaning 'in actuality' or 'actually' with the implication that there may be other reasons to believe otherwise or that the event was contrary to expectations. 
Note that "veyatombasi" is a VERB, and takes an entire embedded sentence as its single argument. In the same way, we can also derive useful equivalents to many English expressions using the word "speaking", such as "technically speaking", "empirically speaking", "frankly speaking", "mathematically speaking", "roughly speaking", "officially speaking", and so on. 18.9 DOES MODALITY HAVE STRUCTURE? When we discussed deictics, we saw how a basic distinction between 1st, 2nd, and 3rd person was able to explain the way deictics are used in natural languages. In other words, there was an underlying structure to deictic concepts. I can't help wondering if modality also has an underlying structure. The characteristics of modality that we've been discussing certainly imply that modality has structure, or at least rules that can be used to identify a concept as inherently modal or not. However, if we knew the structure of modality, we could PREDICT new modalities. With only rules, all we can do is eliminate potential contenders. There is also the question of whether epistemic and deontic concepts come in pairs. My gut feeling tells me that they do, but, so far, I have been unable to define the underlying relationship. In any case, I would definitely give a lot more thought to the basic concepts that underlie modality before implementing it in an AL. The list of modalities that I have provided here is unsatisfying to me because, although it provides testability, it lacks predictability. If modality DOES have structure, then we should be able to discover this structure, formalize it, and use it to predict the existence of OTHER modalities that we might otherwise miss. Of course, it's also possible that modality does NOT have structure, for the same reasons that basic states and actions do not have structure. Fortunately, the system being proposed here does not require that modality have structure, and can work quite well whether such structure exists or not. 
19.0 ANAPHORA [This section is a compilation of a few articles I posted to the 'conlang' email discussion list in October 1993. Rather than spend a lot of time re-writing it to make it conform to the general style of this monograph, I decided to be lazy, and am inserting the original material with only some minor editing.] I don't feel that the design of a comprehensive yet simple anaphoric system is especially difficult. All natural languages have one. However, as is typical of natural languages, the anaphoric systems are clouded in idiosyncrasy and irregularity. One of the problems that many people have is that they tend to think of anaphora as belonging to a special, closed class of words. In English, we think of third person pronouns ("he", "she", "it", etc.), demonstratives ("this", "those", etc.), auxiliaries ("be", "have", and "do") and a handful of oddballs ("herself", "each other", "so", "such", etc.) as most of the available anaphora. Here are some examples: I love anchovy ice cream. Do you? (Anaphor: "do") William Shakespeare lived in a small town with his pet rock and his wife Fifi Yokohama. He would not eat veggies, she would not eat vegemite, and IT didn't eat at all. (Anaphora: "his", "he", "she", and "IT") John said he'll definitely attend the class on Creative Suffering. Louise will too. (Anaphor: "will") However, these 'closed class anaphora' are not the only ones. Consider the following: 1. Ten theoretical physicists and eight sanitary engineers attended the seminar. They were constantly heckling them. Obviously, we can't use the anaphora "they" and "them" in the second sentence of (1). Instead, we need something like: 2. The engineers were constantly heckling the physicists. The point, though, is that the words "engineers" and "physicists" in (2) are anaphora, and they can continue to be used as such throughout the remainder of the dialog. Thus, the head word of a phrase is used as a referent for the entire phrase. 
Since these anaphora are actually nouns, which are open class words, I'll call them _open class anaphora_. Sometimes, especially when writing, we define new open class anaphora explicitly, as in: 3. This contract is between Steven Speedemon (henceforth the first party) and Wendall Whiplash (henceforth the second party)... In (3) the anaphora are explicitly defined as "the first party" and "the second party". But we can also do it in informal writing and speech: 4. Ten computational linguists and ten theoretical linguists attended the seminar. The comps were constantly heckling the theos. Finally, the theos got so angry that they mooned the comps and left. Another common way to create open class anaphora is to use single letters or abbreviations: 5. In discussing the "Best Artificial Language Linguists Ever Designed" (BALLED), the designers forgot that there were many other lingwackos out there, who were out to get BALLED and who would ridicule it at every opportunity. Of course, once an abbreviation becomes recognizable without introduction, it will no longer be an anaphor - it will be a proper noun (like USA, IBM, etc). The major difference between the open (O) and closed (C) classes of anaphora is that the Os tend to keep their referents throughout the discourse, while the referents of the Cs are constantly changing. Thus, the anaphor "BALLED" in (5) will refer to the same thing throughout the dialog, while anaphora such as "he", "do", or "each other" will continually take on new meanings. One other thing should be mentioned. Most anaphora are "backward-referring"; that is, the anaphor refers to something that was mentioned earlier. In English, it is also possible to have "forward-referring" anaphora, as in: 6. After ordering a pint of his favorite ale, Robert was perplexed when the barmaid replied that the fishmonger was next door. The Great English Vowel Shift had begun. In (6) "his" precedes its referent "Robert". 
Forward-referring anaphora are sometimes called _cataphora_.

So, how do you handle anaphora in an AL? In my opinion, the simplest, most natural, and most flexible solution is to use a form of contraction. The result would always be immediately recognizable as an anaphor by its form. The contraction could then be used as an anaphor for the entire phrase from that point on. You could modify this rule to allow the contraction to take on a new meaning if its pattern matches a new and different phrase. Here's how something like this might sound in English:

        The Sheboygan Bandits and the Milwaukee Dragoons faced off at
        Lovemud Stadium on Sunday. The Mil-goons beat the She-its out
        of their expected title.

Using the system proposed here, I recommend the implementation of two types of anaphor: simple anaphora and compound anaphora. A simple anaphor will have two parts: the initial morpheme of the head word of the expression it refers to plus a special terminator. A compound anaphor will have three parts: the initial morpheme of the head word plus the initial morpheme of a significant modifier of the head word plus the special terminator. If the initial CV(V) of a morpheme is not itself a terminator, then it may be used instead of the complete morpheme. Here are the special terminators that we will use in the sample language:

        Noun:             -ha
        Adjective:        -ho
        Verb:             -hi
        Adverb/case tag:  -he

For example, consider the following noun phrase:

        veyaneyada   xawezoyano   daino
        engineer     sanitary     ten
        "ten sanitary engineers"

The simple anaphor could be either "veyaha" or "veha" (since "-ve-" is not a terminator), and the compound anaphor could be either "veyaxaweha", "vexaweha", "veyaxaha", "vexaha", "veyadaiha", or "vedaiha". The choice between "xa(we)-" and "dai-" would depend on whether the speaker thought the concept 'sanitary' or 'ten' was more important.

In general, simple anaphora will have shorter lifetimes than compound anaphora.
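The construction rule for simple and compound anaphora is mechanical, so it can be sketched in a few lines of Python. This is only an illustration under several assumptions of mine: the caller already knows the initial morpheme of the head word and of the chosen modifier (the sketch does not segment words), a morpheme begins with a single consonant, and the set of terminators to avoid is passed in by the caller. The default set below contains only the four anaphor terminators listed above, so it is certainly incomplete for the full sample language. All function and variable names are hypothetical.

```python
VOWELS = "aeiou"

# The four special anaphor terminators listed in the text.
ANAPHOR_TERMINATORS = {
    "noun": "ha",
    "adjective": "ho",
    "verb": "hi",
    "adverb": "he",  # also used for case tags
}


def initial_cvv(morpheme):
    """Return the initial CV(V): first consonant plus one or two vowels."""
    end = 1
    while end < len(morpheme) and end < 3 and morpheme[end] in VOWELS:
        end += 1
    return morpheme[:end]


def usable_forms(morpheme, terminators):
    """Full morpheme, plus its CV(V) if that is not itself a terminator."""
    forms = [morpheme]
    short = initial_cvv(morpheme)
    if short != morpheme and short not in terminators:
        forms.append(short)
    return forms


def anaphora(head, modifier=None, part_of_speech="noun", terminators=None):
    """List the candidate anaphora for a phrase.

    head and modifier are the INITIAL MORPHEMES of the head word and of
    a significant modifier. With no modifier the result is the simple
    anaphora; with one, the compound anaphora.
    """
    if terminators is None:
        terminators = set(ANAPHOR_TERMINATORS.values())  # assumed, incomplete
    ending = ANAPHOR_TERMINATORS[part_of_speech]
    heads = usable_forms(head, terminators)
    mods = usable_forms(modifier, terminators) if modifier else [""]
    return [h + m + ending for m in mods for h in heads]


print(anaphora("veya"))          # simple anaphora
print(anaphora("veya", "xawe"))  # compound, stressing 'sanitary'
print(anaphora("veya", "dai"))   # compound, stressing 'ten'
```

Run on the morphemes of "veyaneyada xawezoyano daino", the sketch produces the same candidates listed above; which one a speaker actually uses remains a matter of choice.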
In fact, in the sample language, we will adopt the rule that a compound anaphor will never change its referent for the remainder of the discourse or text. [Incidentally, "veyaneyada" was derived from the root "-veya-", meaning 'real' or 'existent' plus the noun classifier "-neya-" to indicate a member of a profession. (The word for 'engineering' would thus be "veyatiwada", where "-tiwa-" is the classifier for fields of endeavor.) The word "xawezoyasi" is the A/P-s verb meaning 'to keep (someone/something) clean', and is the agentive form of the P-s verb "xawesi = xawesesi" meaning 'to be clean'.] Due to its inherent nature, an anaphor cannot (and SHOULD not!) undergo further derivation. Thus, if we need the genitive form of an anaphor, we will be required to use the open adjective "mabie". Fortunately, the normal interpretations of the non-noun forms of an anaphor of a noun phrase are completely useless. Because of this, I suggest that the non-noun forms have a genitive interpretation by default. For example, if the anaphor "veha" refers to 'ten sanitary engineers', we could say something like "veha were angry at veho boss", meaning 'THEY were angry at THEIR boss'. An anaphor of an adverb will probably not be needed very often because the generic anaphor "he" will almost always be adequate. It corresponds to the English expressions "thusly" or "in that way/manner". If an anaphor of a case tag is used, its meaning will include the case tag plus its argument. Similarly, an anaphor of a verb will refer to the verb and all of its arguments. But the anaphor can never take core arguments of its own, although it can be modified by an appropriate modifier, such as an oblique argument or adverb. The same applies to anaphora of case tags. For example, the sentence: John teyokosi his son to swim = John taught his son how to swim. could be immediately followed by: Tehi because he felt that ALL children should learn how to swim. 
where "tehi" is an anaphor for the complete first sentence. Thus, "tehi" would be translated as 'He did it' or 'He did so'. The case role introduced by "because" modifies "tehi". Forward-referring anaphora (i.e. _cataphora_) are not really necessary in a language. In my opinion, they should not be implemented at all. Earlier, we discussed how to implement reflexive constructions, as in: Samantha looked at herself in the mirror. In a case such as this (i.e. when the reflexive stands alone as opposed to being implemented with a CCM), it is also possible to use an anaphor instead: Samantha looked at saha in the mirror. where "saha" is an appropriate simple anaphor of "Samantha". [We'll discuss how to create proper nouns later.] In general, I feel that it is preferable to use the reflexive construction, if only because it seems to be almost universal among natural languages. Finally, it's interesting to contrast anaphora with deictics. Deictics, as we discussed earlier, are pointers to entities EXTERNAL to the discourse (e.g. this book, there, yesterday, you, then, etc.). Anaphora, however, are pointers to entities INTERNAL to the discourse (e.g. I saw Louise before SHE left, THAT is why she was so upset, IT caused all kinds of problems, etc.). Natural languages often use third person deictics for both functions (e.g. deictic: "Please hand me THAT book" vs. anaphoric: "I knew THAT"). In the system proposed here, deictics and anaphora are completely different, intentionally, because their semantics are completely different. This implies that the speaker should be careful to use deictics only where appropriate. For example, the word "they" is a deictic in "(Speaker points to some people nearby) who are THEY?", while "they" is an anaphor in "I saw Bill and Mary yesterday. THEY just bought a new house." With the system proposed here, third person personal pronouns will hardly ever be necessary. Instead, anaphora will almost always be used in their place. 
Some people may find this distinction a difficult one to master, especially if their native language allows third person deictics to be used as anaphora. However, the problem is not quite as severe as it may seem. Keep in mind that third person deictics refer to entities other than the speaker or listener. Thus, their meaning automatically INCLUDES any anaphoric referent. It is for this reason that many natural languages use third person deictics as anaphora. In other words, third person referents are usually both internal AND external to the discourse. Thus, either an anaphor or a deictic can be used. However, in the system proposed here, an anaphor is never ambiguous while a third person deictic can definitely be ambiguous. Consider the following: Bill visited John yesterday. He was totally drunk. If you use an anaphor, "he" will have only one possible referent. If you use a deictic, "he" can refer to either "Bill" or "John". It could even refer to someone other than Bill or John. Thus, use of deictics in place of anaphora for third person referents is semantically correct, but may be ambiguous. However, even in cases where ambiguity is unlikely, I feel that use of deictics in place of anaphora should be discouraged. 19.1 ANAPHORA AND DISAMBIGUATION During the discussion that took place on the conlang list, one individual criticized my proposed anaphoric system because it could not deal with the following kind of problem: A dog was attracted to a dog. But its owner kept it away from it. I agree that the proposed system cannot deal with this kind of situation, but I don't understand why anyone would WANT an anaphoric system to be able to deal with it (in our system, the anaphor "it" could only refer to the second dog, since the second dog was the last one mentioned). This kind of situation will only occur when the speaker is being humorous or intentionally ambiguous. As far as I'm concerned, if the speaker wants to have fun, then let him! 
Also, our system DOES allow for intentional ambiguity, since a deictic can be used in place of an anaphor. Besides, you could always distinguish between "the first dog" and "the second dog", or "the former" and "the latter". In my opinion, this is a non-problem, and I see no reason to implement a solution to it. However, we most certainly CAN deal with a more reasonable version of this sentence, such as: A big dog was attracted to a little dog. But its owner kept it away from it. Using compound anaphora, one possible English-like permutation would be: A big dog was attracted to a little dog. But bi-og's owner kept bi-og away from li-og. One other problem that cropped up in the discussion had to do with resolving the individual referents of a phrase that inherently referred to more than one entity. For example, does the phrase "two identical twins", provide a single referent or a double referent? How about the phrase "box of nuts and bolts" or "ten million civilians"? I strongly feel that a properly designed anaphoric system should be able to provide an unambiguous index to any referent. The system I propose here does this very well. Furthermore, if the referent is ambiguous, then the anaphor should be equally ambiguous. In other words, the anaphoric system should not be given the additional duty of disambiguating an ambiguous expression. Any disambiguation should be handled explicitly by the speaker. Thus, "a dog and a dog" is intentionally ambiguous (in addition to being unnatural). I do not feel that an anaphoric system should be required to resolve an intentional ambiguity. In the case of "two identical twins", only one referent was provided, and the system proposed here can deal with it very well. The referent is "two identical twins", and one possible English-like anaphor would be "id-ins". Now, some people feel that an anaphoric system must also provide an unambiguous index to EACH of the twins (e.g. "he" and "she"). 
If so, then the anaphoric system must provide an index to a referent that has not even been mentioned. If neither of the twins has been mentioned separately, then the referent does not exist, and I see no reason to provide an index to a non-existent referent. In other words, what some people seem to want is an anaphoric system that can also provide _semantic decomposition_. I do not feel that this should be the purpose of an anaphoric system, even though it is occasionally possible in natural languages. Considering the many, many possible kinds of groupings (twins, clubs, choirs, companies, orchards, boxes of spare parts, etc.), such a system would be very complex, and I'm not even sure if it would be possible. In summary, I feel that an anaphoric system should be rich enough to provide an unambiguous index to any unambiguous referent. Such a system should NOT have the additional duties of disambiguation or semantic decomposition. 20.0 RELATIVE CLAUSES AND RESUMPTIVE PRONOUNS In my earlier essay on syntax, I discussed two kinds of relative clause that are most common among natural languages. The first kind, which is found in a large minority of natural languages (including English), uses a single relative conjunction (e.g. "which", "that", or "who") plus a _gap_, as in the following example: John saw the book WHICH Bill bought (gap). Note that "the book" is the object of the verb "saw", as well as the implied object of the verb "bought". The gap is required by English syntax. The second kind of relative clause, which is found in a slight majority of natural languages, uses a relative conjunction plus a _resumptive pronoun_, as in the following examples: John saw the book WHICH Bill bought IT. Here, the gap is filled by the resumptive pronoun "IT" that refers back to "the book". The use of resumptive pronouns has one disadvantage compared to the use of gaps, but has three advantages. The single disadvantage is that an extra word is needed; i.e. 
the resumptive pronoun (RP) itself. The advantages are as follows: 1. ANY noun can be relativized, regardless of the function it performs in the embedded sentence, or of the number of functions it performs: Gap: *I saw the car WHOSE driver got thrown from. RP: I saw the car WHICH ITS driver got thrown from IT. Here, "IT" is the resumptive pronoun and has the morphological form of a noun. "ITS" is the possessive form of the resumptive pronoun. 2. ANY noun can be relativized, regardless of how deeply the gap or resumptive pronoun is embedded: Gap: *This is the man WHO Louise bought a car from the same dealer that sold a Cadillac to. RP: This is the man WHO Louise bought a car from the same dealer that sold a Cadillac to HIM. Here, "HIM" is the resumptive pronoun and unambiguously links to "the man". 3. Use of a resumptive pronoun allows it to be combined with other nouns in coordinated structures: RP: I just met this real tall guy WHO my sister dated both HIM and HIS real short brother. In order to deal with the above examples, languages like English must split them up into two or more sentences. For example, the third example would have to be something like this: My sister dated this real tall guy and his real short brother. I just met the tall one. Since I feel that the advantages of resumptive pronouns significantly outweigh the single disadvantage, I will not spend any more time discussing the gap approach. [Incidentally, the use of gaps has a major disadvantage that does not apply at all to the use of resumptive pronouns. Computer parsing of relative clauses using gaps can be extremely complicated, and can often fail completely without even more complicated semantic/contextual processing.] 20.1 IMPLEMENTATION OF RELATIVE CLAUSES We need to create only three basic words to completely implement relative clauses that use resumptive pronouns: a relative conjunction, a resumptive pronoun, and a genitive form of the resumptive pronoun. 
A relative conjunction simply provides a non-specific link between a noun and the relative clause that modifies it. Thus, it performs exactly the same function as the generic linker "mabie" which we've used extensively so far. The difference, though, is that we've always used it to link two noun phrases. However, there is no reason why the argument of "mabie" cannot be a complete embedded sentence. In other words, the genitive linker "mabie" performs the function of the English genitive preposition "of" when followed by a noun phrase, and performs the function of the English relative conjunctions "that/who/which" when followed by an embedded clause. This is more easily understood if we paraphrase these functions as in the following examples: the box "of" toys = the box 'of-the-entity' toys the boy "who" broke the window. = the boy 'of-the-event' he broke the window You can also use the paraphrases "associated-with-the-entity" and "associated- with-the-event". Note that, while this approach may seem odd to speakers of English, it is semantically correct. In fact, many natural languages (most notably Mandarin Chinese) use exactly the same approach. Thus, in the sample language, we will use the following: Relative conjunction: mabie For the resumptive pronoun and its genitive form, we have two choices: we can use anaphora or we can create two invariant particles for use specifically as resumptive pronouns. The sample language will provide both options. And since we've already discussed anaphora, I will concentrate on the use of particles in the remainder of this discussion. Here are the particles we will use: Resumptive pronoun: ka Genitive resumptive pronoun: xaka Here are a few examples using "mabie", "ka", and "xaka": The shirt mabie you want ka is on the bed. = The shirt that you want is on the bed. The police caught the man mabie ka robbed the bank. = The police caught the man who robbed the bank. Here's the hammer mabie he broke the window with ka. 
= Here's the hammer that he broke the window with. They examined the room mabie the fire started in ka. = They examined the room that the fire started in. Here are some that are impossible to do in English: I saw the car mabie John met the little old lady who built ka in three days. = John met the little old lady who built a car in three days. I saw the car. I just threw out a book mabie John was wondering if I would lend ka and a few others to him. = I just threw out a book. John was wondering if I would lend it and a few others to him. Here are some examples using the genitive form of the resumptive pronoun, "xaka": That's the man whose wife the police just arrested. = That's the man mabie the police just arrested xaka wife. That's the man whose wife was just arrested by the police. = That's the man mabie xaka wife was just arrested by the police. Note that there is never any ambiguity because "mabie" always links to the closest preceding noun, and "ka/xaka" always links to the closest preceding "mabie" that is being used as a relative conjunction. Note also that "mabie" can be glossed in English as "who", "which", or "that", depending on its referent. For a relative clause that is embedded inside another relative clause, there is no real need for a different relativizer or resumptive pronoun - simply let them operate like a push/pop stack. In other words, when a level of embedding is exited, the previous level again becomes effective. Here is an example: *I saw the gun that the bank robber who the police just caught killed the teller with. I saw the gun (mabie the bank robber (mabie the police just caught ka) killed the teller with ka). The parentheses illustrate the nesting levels. Incidentally, linguists refer to this process as _center embedding_. Center embeddings such as the above are generally forbidden in languages that use gaps in relative clauses. 
However, center embeddings are quite common in languages that use resumptive pronouns, since there is less chance of ambiguity. However, I must caution the reader that all natural languages seem to limit the NUMBER of nesting levels that are allowed. For example, I know of no language that will allow the speaker to embed an additional relative clause inside the innermost clause of the above example. This restriction seems to be due to the increased processing difficulty of such structures. The processing difficulty of relative clauses (regardless of the type or depth of embedding) may possibly be reduced by completely bracketing the relative clause. For example, Persian (also known as Farsi) introduces a relative clause with a particle equivalent to "mabie", but also terminates the relative clause by appending a special morpheme to the last word in the clause. You may want to consider doing this in your own AL. [Incidentally, Persian also uses resumptive pronouns.] Finally, keep in mind that it's really not necessary to use the particles "ka" and "xaka" for resumptive pronouns. Simple anaphora of the noun being modified would be just as effective and, in some cases, would allow more freedom of expression. In fact, many (and perhaps most) natural languages that use resumptive pronouns use the same morphemes that are used for anaphora. Still, there's no reason why you can't allow both approaches. 20.2 NOMINAL RELATIVE CLAUSES Relative clauses can either MODIFY nouns or ACT as nouns. Those that act as nouns are usually called _nominal_ or _headless_ relative clauses. In our sample language, these can be easily implemented by using the open noun form of the relative conjunction rather than the open adjective form. In other words, we can use "magiu" instead of "mabie". Here are a few examples: I know WHO broke the window. = I know magiu ka broke the window. They saw WHAT John brought. = They saw magiu John brought ka. She showed me WHERE the boys went. 
= She showed me magiu the boys went medope ka. [Here, "medope" is the 'destination' case tag that we derived earlier. Literally, the sentence can be glossed as 'She showed me what the boys went to it'.] He told me WHO he bought it for. = He told me magiu he bought it gupe ka. [Here, "gupe" is the 'beneficiary' case tag that we derived earlier.] You told me WHY you sold it. = You told me magiu you sold it fiape ka. [Here, "fiape" (also "veyavipe") is the 'reason' case tag that we derived earlier.] Bill told me HOW he did it. = Bill told me magiu he did it zejope ka. [Here, "zejope" is the 'means/method' case tag that we derived earlier.] Note that "magiu", the open noun form of the relative conjunction, can be paraphrased as "the person/place/time/thing which" or simply "that which". Thus, for nominal relative clauses, the open noun form of the relative conjunction acts as both the relative conjunction and the argument of the preceding verb. It's also possible to use derivations of the case tags directly, without a relative conjunction. In order to do this, however, we must invert the case tag, convert it to a noun, and then open up its argument structure. For example, the P/F-s locative case tag "mepe" can be paraphrased as 'being at'. Thus, the OPEN F/P-d inverse noun form "mevigiu" means simply 'the location where'. In other words, the argument of the open noun (i.e. the embedded sentence or the patient of the embedded sentence) will be the patient of the inverted locative: She showed me WHERE the boys bought the magazine. = She showed me MEVIGIU the boys bought the magazine. [Literally, this can be glossed as 'She showed me the location where the boys bought the magazine'.] Let's do the same for the other examples that used case tags: He told me WHO he bought it FOR. = He told me GUVIGIU he bought it. [In English, this can be closely rendered as 'He told me the beneficiary for whom he bought it'.] You told me WHY you sold it. = You told me VEYAGIU you sold it. 
[This sentence can be glossed as 'You told me the reason for your selling it'.] Bill told me HOW he did it. = Bill told me ZEJOVIGIU he did it. [This sentence can be glossed as 'Bill told me the method of his doing it'.] This approach can also be extended to situations that do not have close English counterparts: I saw where he was walking towards. = I saw magiu he was walking meguipe ka. OR = I saw meguivigiu he was walking. [Here, "meguipe" is the 'potential destination' case tag. The second sentence can be glossed as 'I saw the towards-location of his walking' OR 'I saw the place towards which he was walking'.] The astute reader may now be wondering why there is any need AT ALL for relative conjunctions, since we can always use an appropriate OPEN ADJECTIVE in its place. Here is an example: I saw the building that he was walking towards. = I saw the building mabie he was walking meguipe ka. OR = I saw the building meguivibie he was walking. In other words, we can take advantage of the perfect symmetry inherent in the way we are designing case tags. If a case tag can link an argument of a main verb to its own argument, the inverse form can perform the exact reverse operation: link the argument appearing before the case tag to an argument of the sentence that follows it. This is exactly what we did in the last example. Thus, the inverse adjective form can be paraphrased as 'X-which', where "X" is a case tag. Here's one more example: There's the girl that he bought the flowers for. = There's the girl mabie he bought the flowers gupe ka. OR = There's the girl guvibie he bought the flowers. Here, "guvibie" is exactly equivalent to English "for whom". However, at first glance, it may seem that this approach does not lend itself very well to relativizing the subject or object of an embedded clause, because subjects and objects do not require case tags. 
For these, it seems that we will need to perform a passive or anti-passive operation on the embedded verb and use the appropriate form of the oblique case tag. Here are two examples: The police caught the man who robbed the bank. = The police caught the man mabie ka robbed the bank. OR = The police caught the man nuvibie the bank was robbed. [Here, "nuvibie" is derived from the passive oblique case tag "nupe", which we derived earlier. Thus, "nuvibie" is equivalent to English "by whom".] The shirt that you want is on the bed. = The shirt mabie you want ka is on the bed. OR = The shirt gavibie you want(anti-passive) is on the bed. [Here, "gavibie" is derived from the oblique anti-passive case tag "gape", which we derived earlier. Note that we have to use the anti- passive form of the verb "want", and since English does not have an anti-passive construction, there is no way to truly capture the anti-passive sense. All we can say is that "gavibie" is equivalent to the English "who/which/that" when it refers to the object of the embedded clause.] Thus, it's possible to implement relative clauses without the need for relative conjunctions. However, will doing so force people to use an anti-passive construction on the verb which they may find difficult to master? No. Keep in mind that when using case tags like "nupe" and "gape", it is NOT necessary to mark the verb itself as passive or anti-passive. In fact, as we discussed earlier, doing so is redundant. Thus, in the above two examples, there is no need at all to mark the verb "to rob" as passive or the verb "to want" as anti-passive. Thus, there is really no need to implement the relative conjunction "mabie" and the resumptive pronouns "ka" and "xaka", since their functions can be performed more efficiently by appropriate derivations of case tags. However, the use of relative conjunctions and resumptive pronouns is very common among natural languages, and the AL designer may wish to emulate them. 
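The case tag derivations used in this section are regular enough to be sketched mechanically. The following Python fragment is only an illustration, and the function name "invert_case_tag" and the constant names are my own. It assumes that every case tag ends in "-pe", that the inverse CCM "-vi-" is inserted directly before the part-of-speech ending, and that the open noun and open adjective endings are "-giu" and "-bie", as in the examples above. It also assumes that inverting a tag that is already inverse simply removes "-vi-", which is my reading of the "veyavipe"/"veyagiu" pair and is not stated as a rule anywhere above.

```python
CASE_TAG_ENDING = "pe"
INVERSE_CCM = "vi"
OPEN_NOUN = "giu"       # e.g. "mevigiu" = 'the location where'
OPEN_ADJECTIVE = "bie"  # e.g. "guvibie" = 'for whom'


def invert_case_tag(tag, ending):
    """Derive the inverse open noun or open adjective form of a case tag."""
    if not tag.endswith(CASE_TAG_ENDING):
        raise ValueError(f"{tag!r} does not look like a case tag")
    stem = tag[:-len(CASE_TAG_ENDING)]
    if stem.endswith(INVERSE_CCM):
        # Assumption: a second inversion cancels the first (veyavipe -> veyagiu).
        stem = stem[:-len(INVERSE_CCM)]
    else:
        stem += INVERSE_CCM
    return stem + ending


for tag in ("mepe", "gupe", "zejope", "meguipe", "veyavipe"):
    print(tag, "->", invert_case_tag(tag, OPEN_NOUN))
for tag in ("gupe", "meguipe", "nupe", "gape"):
    print(tag, "->", invert_case_tag(tag, OPEN_ADJECTIVE))
```

The shortcut form "fiape" is deliberately not handled, since the text only gives its inverse through the longer synonym "veyavipe".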
It is also possible, of course, to provide both options, which we are doing here. Finally, relative conjunctions and nominals can take numeric multipliers. Here are a few examples (I've repeated a few numeric morphemes below for easy reference): -vastu- seven -saksi- all, the whole amount -mai- many, a lot, a large amount -zonja- any, one or more, greater than zero He'll tell zonjano magiu ka is there. OR He'll tell mazonjagiu ka is there. = He'll tell whoever is there. I saw maino magiu he owned ka. OR I saw mamaigiu he owned ka. = I saw the many things that he owned. I saw saksino magiu he owned ka. OR I saw masaksigiu he owned ka. = I saw all/everything that he owned. I saw vastuno magiu he brought ka. OR I saw mavastugiu he brought ka. = I saw the seven things that he brought. And so on. In fact, since the relative conjunction is actually an adjective, it can take an adjective modifier (i.e. a previous-word modifier), and since a nominal relative conjunction is actually a noun, it can be modified by an adjective. For example, we could create a sentence such as "I saw maino beautiful magiu she owned ka", meaning 'I saw the many beautiful things that she owned'. 20.3 NON-RESTRICTIVE RELATIVE CLAUSES All of the relative clauses we've discussed so far are typically referred to as _restrictive_ relative clauses, since they 'restrict' or 'reduce' the number of possible referents to the head noun. Some languages, such as English, allow the same form to be used with a non-restrictive sense (but with a noticeable difference in timing and intonation). These clauses simply provide additional information about the head noun. Here are a few examples: Restrictive: The man who robbed the bank... Non-restrictive: The elephant, which is a large animal, ... Restrictive: The mower that is in the garage is broken. Non-restrictive: The mower, which is in the garage, is broken. 
Since a non-restrictive relative clause is the same as any other kind of parenthetical structure, I feel that it should be treated as such. In my opinion, it should NOT be treated in the same way as a restrictive relative clause, for the simple reason that the two are semantically quite different. [I will discuss how to deal with parenthetical structures later.] 21.0 INTERROGATIVES In the section on modality, we discussed how non-clausal modals could be used to implement English cleft sentences (e.g. "Was it John that just left?"). We now need to address how to implement more general interrogatives, in which the listener is being asked, in effect, to 'fill in a blank'. Here are some English examples: WHO closed the window? WHY did he close the window? HOW did he close the window? WHERE did he close the window? And so on. We also need interrogative modifiers, as in the following: WHICH boy closed the window? WHAT kind of people live here? HOW many people live here? HOW heavy was the box? And so on. In order to create interrogative sentences, we used the interrogative probability modality. However, that modality questions the TRUTH of an event. What we need here is a way to represent an empty argument in the argument structure of a sentence - an argument which the listener is being asked to provide. Since this is quite different from the interrogative probability modality, it must be implemented differently. In my opinion, the easiest way to implement general interrogatives is to create a unique interrogative morpheme which can be used generically or can be made part of larger words. In the sample language, we will use the MCM "-ku-" for this purpose. Thus, the completely generic noun will be "kuda". We will also adopt the convention that whenever "-ku-" is affixed to a verbal stem, it will fill the FINAL argument in the argument structure, making the use of "kuda" unnecessary in that position. 
We will also assume that, like "-na-", the default verb class of stand-alone "-ku-" is "0". Here are some examples: Who closed the window? = Kuda benzosi the window? [Here, "benzosi" is the A/P-d verb meaning 'to close'.] What did Billy close? = Billy benzosi kuda? = Billy benzokusi? [In the second example, "-ku-" fills the object (i.e. patient) position of the verb, converting it to an interrogative.] Who closed what? = Kuda benzosi kuda? = Kuda benzokusi? Why did he close the window? = He benzosi the window fiape kuda? = He benzosi the window fiakupe? [Here, "fiape" is the reason case tag. Note how "-ku-" fills the object position of the case tag in the second example.] How did he close the window? = He benzosi the window zejope kuda? = He benzosi the window zejokupe? [Here, "zejope" is the method case tag.] Where did he close the window? = He benzosi the window mepe kuda? = He benzosi the window mekupe? [Here, "mepe" is the locative 'at' case tag.] How heavy is the box? = The box hayumasi kuda? = The box hayumakusi? [Here, we are using the P/F-s verb "hayumasi" meaning 'to weigh'. As we discussed earlier in the chapter on counts and measures, it is the focused version of the P-s verb "hayusi" meaning 'to be heavy'. The stand-alone UNFOCUSED verb "hayukusi" means simply "Which one is heavy?" or "Which ones are heavy?".] What is a duck? = Kuda vevisi guasuda? [Literally, this means 'What is the nature of a duck?', where "-ve-" is the essential quality CCM and "-vi-" is the inverse CCM. Refer to the chapter on CCMs to see how "-vevi-" is derived.] Kusi? = Huh? or What? Other parts-of-speech are also useful. Here are some examples: Which duck(s) closed the window? = Kuno guasuda benzosi the window? = Guasukuda benzosi the window? Whose pencil is this? = This is kuxano pencil? [Here, "-xa-" is the genitive CCM.] What kind of people live here? = Vevikuno people live here? 
[A more accurate gloss would be 'What qualities or characteristics do the people living here possess?'. See below for additional comments on "kuno" and compare with "vevikuno".] For the English expression 'how many' or 'how much', we need a scalar state root to represent the concept of 'having quantity or amount', just as "-hayu-" represents the concept of 'having heaviness/weight', "-lenga-" represents the concept of 'having (spatial) length', and so on. When the verb is focused, the focus will represent the actual quantity or amount. For example, if the scalar state root "-kanti-" represents the concept 'having quantity/amount', we can do the following: How many boxes are there (= the boxes number how many)? = The boxes kantimasi kuda? = The boxes kantimakusi? How many people live here? = The people who live here kantimasi kuda? = The people who live here kantimakusi? = Kantimakuno people live here? Numeric morphemes can also be used very productively with "-ku-". For example, "kufeda" means 'which one', "kufegeda" means 'which ones', "kududa" means 'which two', the adjective "kufeno" means 'which single/solitary', etc. [In fact, the non-specific numerics that we created earlier are actually macros of "-kanti-" plus the scalar polarity morphemes. For example, "-mai-", meaning 'many', is actually a macro for "kanti + ge", "-pewa-", meaning 'few', is a macro for "kanti + so", and so on.] Note that "kuno" and the English translation 'which' do not overlap completely. The word "kuno" represents any possible modifiers of the noun. Thus, the sentence "Kuno guasuda benzosi the window?" is more accurately glossed as 'What can you say about the duck or ducks that closed the window?'. The English word "which" sometimes has this more general sense, but it more often has the sense 'which one' or 'which ones'. To obtain this more precise sense in the sample language, we can use "kufeno" or "kufegeno". 
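The affixed interrogatives shown so far ("benzokusi", "fiakupe", "mekupe", "hayumakusi", "guasukuda") all follow one pattern: "-ku-" is placed immediately before the part-of-speech ending. A minimal Python sketch of that pattern is given below. The function name "add_ku" is hypothetical, and the list of endings contains only those that occur in the examples of this section ("-si" for verbs, "-pe" for case tags, "-da" for nouns, "-no" for adjectives). It is an assumption of the sketch that the ending can be recognized purely by spelling.

```python
INTERROGATIVE = "ku"
# Part-of-speech endings seen in this section's examples (not a complete list).
ENDINGS = ("si", "pe", "da", "no")


def add_ku(word):
    """Insert "-ku-" just before the part-of-speech ending of a word."""
    for ending in ENDINGS:
        if word.endswith(ending):
            stem = word[:-len(ending)]
            return stem + INTERROGATIVE + ending
    raise ValueError(f"no known part-of-speech ending on {word!r}")


for word in ("benzosi", "fiape", "zejope", "mepe", "hayumasi", "guasuda"):
    print(word, "->", add_ku(word))
```

Because the stem may be empty, the same rule also yields the stand-alone forms: an ending with nothing in front of it gives "kusi", "kuda", or "kuno".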
In the above numeric derivations, we MUST place the numeric morpheme AFTER "-ku-" because of the morphological rules of the sample language. In other words, since a morpheme always modifies what appears to its left, "-ku-" adds the meaning 'What can you say about what appears to the left?'. Thus, "fekuno" means 'What can you say about the "oneness" of the noun it modifies?', while "kufeno" means 'What can you say about the noun it modifies, of which there is exactly one?'. Here are some examples: Kufeno duck left early? = Which duck left early? = Which one of the ducks left early? Fekuno guasuda left early? = What do you mean that ONE duck left early? = Just one duck left early? You don't say! Tell me more! Do not confuse "fekuno" and similar derivations with the interrogative modals that we discussed earlier. The modals question the truth of an assertion. Use of "-ku-" does not at all question the truth of the assertion, but simply asks for more information. In summary, "kuda" (or a derivative) occupies the position of a missing word or expression that would have provided more detailed information, while indicating that it should be replaced by something more specific. When it is affixed to a verbal stem, it effectively replaces the FINAL argument in the argument structure with a question mark. When affixed to a non-verbal stem, it asks for more information about the stem. Use of "-ku-" never questions the truth of an assertion (as interrogative modals do). It simply asks for more information. 22.0 MORE ON RELATIONSHIPS There are several very general kinds of relationships that have been heavily studied by semanticians. Their simplest and most basic forms are all P/F-s verbs. I will simply list them and provide examples of their use. By now, potential derivations using these words (and there are MANY of them) should be obvious. Here is a partial list: Equality: P/F-s -> 'to be', 'to be equal to', 'to be the same as' E.g. 
John is the new president of the company. [This is the verb "kapsusi", which we derived earlier when we discussed the 'state' case role. We also derived a few other useful words from the same root in the section on polarity. The unfocussed P-s adjective has the meaning 'steady' or 'unchanging'.] Equivalence: P/F-s -> 'to be equivalent to', 'to amount to', 'to be comparable to' E.g. The cross-border raid was equivalent to an act of war. Similarity: P/F-s -> 'to be like', 'to be similar to', 'to share/have something in common with' E.g. John is like his father. [This is the verb "losi", which we derived earlier when we discussed the _manner_ case role. We also derived several other useful words from the same root in the section on polarity.] Analogy/Proportionality: P/F-s -> 'to be analogous to', 'to be proportional to' E.g. A dog's relationship to a puppy is analogous to a cat's relationship to a kitten. (i.e. A dog is to a puppy as a cat is to a kitten.) Execution for murder is analogous to fines for petty theft. (i.e. Execution is to murder as a fine is to petty theft.) Volume is proportional to the radius cubed. Paronymy: P/F-s -> 'to be a derivative/derivation of', 'to derive from' E.g. The verb "clarify" is a derivation of the adjective "clear". Kerosene is a derivative of crude oil. [Incidentally, P is referred to as the _paronym_, while F is referred to as the _base_. The root for this concept in the sample language is simply "-mante-", which is the process result CCM discussed earlier. We also used it to derive the word meaning 'mathematics'.] Hyponymy: P/F-s -> 'to be a kind of', 'to be an example of', 'to be a way of' Inverse F/P-s -> 'to subsume', 'to include' E.g. A horse is a kind of mammal. Mammals include horses, dogs, and cats. [Incidentally, P is referred to as a _hyponym_ of F, and F is referred to as a _superordinate_ of P. Thus, 'horse' is a hyponym of 'mammal', and 'mammal' is a superordinate of 'horse'.] 
Relatedness: P/F-s -> 'to be related to', 'to be in the same class as' E.g. Cats are related to dogs, both being mammals. Clams are related to trees, both being living creatures. Hills are related to mountains, hills being smaller. [I'm not sure if this distinction is really necessary, since the similarity relationship seems to cover both concepts.] Compatibility: P/F-s -> 'to be compatible/consistent with' E.g. My views are compatible with yours. His approach is consistent with his earlier work. Constituency: P/F-s -> 'to be part of', 'to be a component of' Inverse F/P-s -> 'to include', 'to contain', 'to have (as a component or part)', 'to be made (up) of', 'to consist of' E.g. A finger is part of the hand. The doghouse is made mostly of plywood. A triangle has exactly three angles. [Do not confuse a constituency relationship with a count/group relationship. Constituency implies that the parts may be quite different in nature, while a group consists of several similar components. For count/group relationships, use the P/F-s numeric derivation "femasi", meaning 'to be one of', which we derived earlier. Incidentally, P is referred to as the _meronym_ of F, while F is referred to as the _holonym_ of P. Thus, 'finger' is a meronym of 'hand', and 'hand' is a holonym of 'finger'.] Supplementation: P/F-s -> 'to be in addition to', 'to be a supplement to' E.g. The money is a supplement to the normal wage. Alternative: P/F-s -> 'to be an alternative to', 'to be a substitute for' E.g. Compromise is the only alternative to war. Alternation: P/F-s -> 'to alternate with', 'to take turns with' E.g. The girls take turns with the boys at the swimming pool. Red flags alternate with blue flags in the row of flagpoles. [Do not confuse 'alternative' with 'alternation'. An alternative is a substitute while an alternate precedes or follows in temporal or locative sequence. 
For example, a temporal alternative relationship implies a substitute at the same point in time thus negating the first argument, while the alternation relationship implies a sequential substitution.] Precedence/Sequence: P/F-s -> 'to precede', 'to come before/in front of', 'to be earlier in sequence than' Inverse F/P-s -> 'to follow', 'to come after/behind', 'to be later in sequence than' E.g. January precedes February. February follows January. The wrist comes before the hand, the lower arm comes before the wrist, and the elbow comes before the lower arm. Contingency: P/F-s -> 'to be contingent/conditional on', 'to hinge on', 'to depend on' E.g. The success of the project depends on complete cooperation. Inverse F/P-s -> 'to entail/imply' 'He shouted again' entails 'He shouted earlier'. Lightning implies thunder. [Important: do not confuse 'contingency/implication' with 'causation'.] Meaning: P/F-s -> 'to mean', 'to signify', 'to stand for', 'to denote' E.g. The French word "maison" means 'house'. His behavior signifies that he is very angry. Other kinds of general relationships which we've already discussed are the P/F and AP/F generic verbs. Note that NONE of the general relationships represent mensuration or evaluation concepts. Note also that all of the technical labels that we introduced above, such as "paronym", "superordinate", and "meronym", can be easily derived from the active and inverse forms of the corresponding verbs. 22.1 THE VERB "TO HAVE" The generic P/F-s state verb "masi" indicates that a relationship exists, but implies nothing about the nature of the relationship. Thus, its meaning includes ALL of the above P/F-s examples. And, as we discussed earlier, the open adjective form "mabie" provides the genitive sense of the English word "of", the Japanese word "no", the Chinese suffix "-de", the Swahili root "-a", and so on. In its verb form, this generic state relationship completely overlaps the sense of the English verb "to have". 
To illustrate this, consider the following examples: John has the book. John's book... The project has a new manager. The project's new manager... The house has a red roof. The house's red roof... He has a good reputation. His good reputation... We had problems with the new equipment. Our problems with... John has an answer to your question. John's answer... I had supper at 6 o'clock. My supper... In other words, the semantics of the genitive and the verb "to have" are identical, and encompass much more semantic space than the prototypical sense of 'possession', 'ownership', or 'control'. Thus, the verb "masi" is in most respects the equivalent of the English verb "to have". [The English verb is different when it is used as an auxiliary, and when it is used with the causative sense of "I HAD Joe sweep the garage". Also, the meaning of the word defaults to 'possession/ownership' whenever the actual relationship is not clear from context, and this default appears to be universal among natural languages.] Earlier in this monograph, when we discussed the genitive CCM "-xa-", we indicated that when "-xa-" is used in root position it would represent the concept of 'having' with a default class of P/F-s. Thus, the words "masi" and "xasi = xamasi" are precisely synonymous. While this may seem a rather useless and redundant application of "-xa-", it CAN have its uses if it undergoes further modification. In other words, when no root is present (as in "masi"), we assume a generic state root (compare with "zesi", where "-ze-" is the generic action root). And since the root is not present, it cannot be further modified. However, when further modification IS necessary, we can use "-xa-". For example, "masi" = "xasi" both mean 'to have', but to indicate stronger or weaker senses of association, we can NOT modify "masi", but we CAN modify "xasi". 
Thus, "xagesi" means 'to have a lot to do with', "xapisi" means 'to be completely/totally involved with', "xajusi" means 'to be minimally involved with', etc. (The word "xagesi" can also have the sense of the English word "possess" when it does not imply absolute 'control' or 'ownership'.) We will see additional examples below.

Now, some languages make a distinction between _alienable_ and _inalienable_ possession. Alienable possession implies that the possession is inherently temporary (e.g. "John's money"), while inalienable possession implies that the possession is typically permanent (e.g. "John's arm"). If you wish, you can implement this distinction by creating an inalienable possessive root and a corresponding CCM. Note, though, that the adjective version of basic nouns already implies the most common type of inalienable possession:

    "guasuda" = 'duck'

    1. "guasuxano parade" OR "parade mabie guasuda" = 'duck parade' or 'duck's parade'
    2. "guasuno parade" = 'duck parade'

The first example is vague and implies only that the parade was somehow associated with ducks. Thus, the first example could have several meanings, such as that the parade consisted of ducks, that the parade contained people dressed like ducks, that the parade was done for the benefit of ducks, etc. The second example, however, is clearly inalienable, since it means literally 'the parade which BE duck'. In other words, the second example clearly indicates that the parade CONSISTED of 'duck'.

Finally, I mentioned earlier that the root morphemes used to derive the above general relationships would be very productive in deriving additional words.
Let's do some additional derivation of the most useful relationship - the generic state relationship represented by the verb meaning 'to have':

    P/F-s        masi        -> to have
    P/F-s        namasi      -> to not have, to be bereft of
    AP/F-s       fisi        -> to keep, to retain
                 xapifisi    -> to monopolize, to hog
    AP/F-s       nafisi      -> to abstain from, to avoid, to refrain from, to forgo
    P/F-d        dosi        -> to get, to obtain, to receive, to come into/by
    P/F-d        nadosi      -> to lose, to forfeit
    AP/F-d       suasi       -> to accept, to take, to obtain
                 xagesuasi   -> to secure, to reap
    AP/F-d       nasuasi     -> to give away, to get rid of
    AP/F-p       namisi      -> to offer, to present
    A/P/F-d      kosi        -> to give
    A/F-d [+P]   kogasi      -> to transfer, to convey, to give (to)
                 xagekogasi  -> to yield up, to relinquish, to give away
    A/P-d [+F]   nakomiusi   -> to dispossess, to expropriate
    A/P-s [+F]   natuemiusi  -> to deprive

Note that I have implemented the above with the same argument structures as the corresponding English verbs. For example, "kogasi" is anti-passive because the English verb "to transfer/convey" is inherently anti-passive. Also note that the above glosses are completely compatible with the generic derivations of chapter 2. For example, the verb "fisi" means 'to keep' not only with the above sense, but also with the sense 'to remain steadfast on' or 'to stick with'.

It's important to keep in mind that the above relationships are extremely general. Thus, "suasi" meaning 'to take' has the same wide semantic coverage as the English verb, and can be used in a wide range of contexts. Here are some examples:

    I took her advice.
    I took her hand.
    I took her money.
    I took the first place in line.
    I took the elevator to the tenth floor.

For the verb meaning 'to transfer', the primary, oblique patient is the entity to which the focus is being transferred. Thus, a sentence like "John kosi the land gape Bill" would mean 'John transferred the land to Bill'.
[Note that the anti-passive CCM "-ga-" is not needed on the verb because the original patient is being expressed obliquely using the case tag "gape".] For the AP/F-d verb meaning 'to take', the secondary agent-patient can be indicated using the generic 0/AP case tag "piupe", which we discussed earlier. For example, "John suasi the book piupe Tom" would mean 'John took the book FROM Tom'. It would also be possible to use the locative 'from', "menadope", to emphasize that the transfer involved a change of location. 23.0 CONJUNCTIONS A conjunction links two entities or situations, and always provides additional information about the relationship between the items being linked. Also, some conjunctions can be concatenated to link more than two items. Here are a few examples: Louise AND Bill just left. Louise OR Bill OR Mike will give the talk. Bill will go shopping IF Louise wants him to. John just went shopping, BUT he forgot to buy coffee. He bought the book EVEN THOUGH it was very expensive. He was the only one who was sober, SO he had to drive. He finished his homework at 7 PM, AND THEN he went outside to play. Bill missed the target; IN OTHER WORDS, he lost the match. Conjunctions ALWAYS link two expressions of the same syntactic type. For example, if a noun phrase immediately follows a conjunction, the conjunction links it to one or more preceding noun phrases. If a complete sentence immediately follows a conjunction, the conjunction links it to one or more preceding sentences. And so on. Conjunctions can be grouped into the following general categories: Additive: and, also, in addition, besides, furthermore, moreover, similarly, likewise, in the same way, in other words, in conclusion, in summary, etc. Causal: if, then, unless, even if, so, consequently, thus, it follows, because, under the circumstances, for this reason, therefore, etc. 
Concessive/Adversative: but, and even, in spite of, however, although, albeit, notwithstanding, anyway, nevertheless, even though, regardless, even so, despite, just the same, even now, for all that, still, all the same, yet, whether or not, whatever, no matter what, in fact, as a matter of fact, despite that, on the other hand, etc. Substitutive: or, instead of, rather than, in place of, etc. Temporal: then, next, after that, on another occasion, an hour later, finally, afterwards, before that, at last, at the same time, subsequently, etc. Continuatives/Cohesives: uh, now, well, anyway, okay, at any rate, in any case, etc. [Incidentally, the above categories reflect LINGUISTIC/DISCOURSE distinctions based on actual usage in natural language, as opposed to LOGICAL distinctions. Logicians categorize conjunctions quite differently, and, in the process, end up excluding words and expressions that are truly conjunctive in nature, or end up restricting their meanings more than natural languages do. For example, most logicians and formal semanticians would not consider expressions such as "in other words", "afterwards", "on the other hand", and "anyway" as actual conjunctions, because they do not perform basic logical operations on truth conditions. In natural language, however, these ARE conjunctions and they perform important conjunctive discourse functions.] Conjunctions are interesting because of their large numbers and because of the great variety of relationships that they represent. Also, the vast majority of them are derived from basic, open class words. Thus, while conjunctions DO perform a function that is quite different from verbs, nouns, adjectives, etc., their meanings include the concepts of many of these words. 23.1 IMPLEMENTING CONJUNCTIONS Conjunctions fall into two general categories depending on how they are used in discourse. 1. True conjunctions. 
These always link a constituent which follows the conjunction with the closest preceding constituent of the same type. The linkage is thus syntactically precise. Examples: and, or, unless, if, etc. 2. Disjuncts. These only loosely link a constituent which follows the conjunction with one or more of the preceding constituents of the same type. The syntactic linkage is often vague. Examples: but, on the other hand, also, in other words, despite that, etc. To illustrate the difference between true conjunctions and disjuncts, consider the following: The project was over-budget and under-staffed. The project manager was a political hack and his choice for a tech lead was a bureaucrat who could barely spell his name. Three of the engineers and four of the secretaries were sick most of the time. To make matters worse, the technicians had to spend most of their time on another project that had higher priority and more adequate funding. But the project was a great success. Notice how "and" precisely links its arguments, creating new constituents of the same syntactic type. The syntax of the linkage is not in doubt. But there IS doubt about the linkage of the word "but". Does it link to the immediately preceding sentence, to the preceding two sentences, or to the entire preceding paragraph? If "but" were a true conjunction, there would be no doubt about which items were being linked. In effect, the semantics of "but" in the above example is not compatible with the syntax of a true conjunction since the linkage is not clear. The actual linkage can only be determined through context. Now, since true conjunctions and disjuncts are syntactically distinct, we must treat them as distinct syntactic entities; i.e. we must give them different parts-of-speech in the sample language. And since true conjunctions are relatively rare, the obvious solution is to create unique particles (terminator "-ka") for each one. But how do we handle disjuncts? 
Fortunately, there is a way of achieving a disjunctive effect within the existing framework. Earlier, we discussed how a speaker could express his feelings or attitudes about an event by using a verb that takes an entire sentence as its single argument. Here's an example: P/F-s I hope that I'll win. F-s [-P] Hopefully I'll win. where "hopefully" is actually a verb that takes a complete embedded sentence as an argument - it is NOT an adverb as in English. As I stated earlier, words and expressions like these are called _disjuncts_, and many other examples can be derived in the same way: "to presume" -> "presumably", "to be interesting" -> "interestingly", "to be possible" -> "possibly", "to be incidental" -> "incidentally, by the way", "to be necessary" -> "necessarily", "to be fortunate" -> "fortunately", etc. The same approach can be used to create disjunctive conjunctions whose scope is not precise. For these, however, we must demote the SECOND argument of the verb rather than the first argument using an anti-middle construction (CCM = "-xi-"). Here is an example in the sample language: P/F-s: The new project is similar to the previous one. where "losi = lomasi" = 'to be similar to' P-s [-F]: "Loxisi" = 'Similarly, ...', 'Likewise...', 'In like manner', etc. Alternatively, we can use the terminator "-fo", which we used earlier for sentential register, tense/aspect words, and modal words. When used with other roots, it will apply the corresponding state to the entire sentence or clause. Thus, the above example could also be: "Lofo" = 'Similarly, ...', 'Likewise...', 'In like manner', etc. Note that this usage of "-fo" is perfectly consistent with its use with register, tense/aspect, and modal morphemes. Again, the implied argument is inherently deictic, since it can always be determined from the speech situation. Here are some more English examples: P/F-s: The bazaar was in addition to the car wash. P-s [-F]: Additionally, ... 
    P/F-s:     The land swap was an alternative to continued violence.
    P-s [-F]:  Alternatively, ...

    P/F-s:     The accident occurred after the party.
    P-s [-F]:  Afterwards, ...

    P/F-s:     His odd behavior meant that he was angry.
    P-s [-F]:  In other words, ...

    P/F-s:     Red flags alternated with white ones.
    P-s [-F]:  On the other hand, ...

As with other disjuncts, the P-s [-F] word is a VERB that has the part-of-speech terminator "-si" or a clause modifier that has the part-of-speech terminator "-fo". It is NOT a true conjunction ending with "-ka"! The disjunct takes the entire sentence that follows as its single argument.

In sum, a true conjunction should be used only when its linkage is clear. This will almost always be the case when the items being linked are part of the same sentence. In general, a disjunct should be used to introduce a sentence that is only loosely linked to the preceding ones.

Now, most of the basic relationships we discussed in the previous section will provide the roots for the most common conjunctions. For example, the conjunction meaning 'and' will be formed from the root for the 'supplementation' relationship plus the terminator "-ka". Similarly, the conjunction meaning 'or/instead' will be derived from the 'alternative' relationship. The conjunctions meaning 'if/then' will be derived from the 'contingency' relationship. And so on.

The same roots can also be used to derive disjunctive verbs and adverbs. For example, the root used to derive the true conjunction meaning 'and' can also be used to derive the disjunct meaning 'additionally' and the simple adverb meaning 'too/also'. The disjunct meaning 'but/still' can be derived from the 'incompatibility' relationship. And so on.

[Incidentally, you may be tempted to derive the conjunction meaning 'but/still' from the modality meaning 'even', since the modal implies that what happens is unexpected or incompatible with the norm.
However, this would not be correct because the two concepts are really distinct, although one often implies the other. The basic relationship actually COMPARES the patient with the focus, while the modal describes the patient's ATTITUDE about the focus. However, the derivation from the modal remains useful, and would mean something like '(and/but) unexpectedly' or '(and/but) surprisingly'.]

Some of the more complex temporal combinations (e.g. "three days later", "on a different occasion", etc.) can be implemented using a combination of numeric and temporal roots/MCMs. However, expressions like these are almost never true conjunctions or disjuncts - they are usually topicalized components of the sentence which follows. I will discuss how to deal with this later, in the section on _topicalization_.

Finally, keep in mind that these relationships have verb forms which may actually be easier to use than their true conjunctive or disjunctive forms. For example, the P/F-s verb form of the contingency relationship (or its inverse) can be used in all 'if/then' situations. Here's an example:

    (the robber will go to prison) is contingent on (he is guilty)
    = The robber will go to prison if he is guilty.

or

    (the robber is guilty) entails (he will go to prison)
    = If the robber is guilty, then he will go to prison.

The actual form of the embedded sentences will, of course, depend on the syntax of the AL.

23.2 NOTES ON USING CONJUNCTIONS

Languages often use different conjunctions in different environments, even though they represent the same semantics. For example, English favors "and" for linking noun phrases and clauses, while preferring "also" when linking sentences. Also, English never uses "but" to link noun phrases. Instead, it will split the sentence into two clauses and link them with "but...too/also":

    *John but Bill helped the children.
    John helped the children, but Bill helped them too.
In my opinion, there is no need to duplicate the selectional preferences of a particular natural language. Thus, I WOULD allow the following: John but Bill saw Louise. = John saw Louise, but Bill saw her also. John if Bill will go shopping. = John will go shopping if Bill will go too. John even though Bill went shopping. = John went shopping even though Bill also went. And so on. Note though, that the above approach can only be used with true conjunctions (i.e., those terminated by "-ka"). It cannot be used with disjuncts. 23.3 REGISTER VARIATIONS FOR DISJUNCTS There are many different disjuncts that have essentially the same meanings, but which are used in different settings. Natural languages differ widely in the number and nature of these expressions. Fortunately, we can capture these distinctions without having to arbitrarily create words that will have few close counterparts in other languages. We can do this by simply changing the speech register of the more basic disjuncts by using the register MCMs we discussed earlier. Here are a few examples: from 'and/also/too' informal -> 'besides' formal -> 'in addition' very formal -> 'furthermore', 'moreover' from 'but/still' informal -> 'whatever', 'even so', 'for all that' formal -> 'though', 'although', 'however' very formal -> 'nevertheless', 'regardless', 'notwithstanding' from 'even though' formal -> 'despite that' very formal -> 'in spite of the fact that' from 'well/so' informal -> 'okay', 'so anyway', 'so anyhow', 'anyway', 'anyhow', 'okay then' formal -> 'now', 'in any case' very formal -> 'at any rate', 'in any event' from 'then (= thus)' informal -> 'because of this', 'for this reason' formal -> 'thus', 'therefore' very formal -> 'it follows therefore that', 'consequently', 'hence' And so on. The actual distinctions between informal, formal, etc. will vary somewhat from one person to another, and the above examples reflect my own (subjective) conclusions. 
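The register derivation just listed can be pictured as a simple lookup keyed by a basic disjunct and a register. The following is only a minimal sketch of that idea: the names REGISTER_GLOSSES and glosses are hypothetical, the sample language's actual register MCMs are not shown, and the English glosses are simply the ones given in the list above.

```python
# Hypothetical table: (basic disjunct, register) -> English glosses.
# The glosses are copied from the list above; the structure is an assumption.
REGISTER_GLOSSES = {
    ("and/also/too", "informal"): ["besides"],
    ("and/also/too", "formal"): ["in addition"],
    ("and/also/too", "very formal"): ["furthermore", "moreover"],
    ("but/still", "informal"): ["whatever", "even so", "for all that"],
    ("but/still", "formal"): ["though", "although", "however"],
    ("but/still", "very formal"): ["nevertheless", "regardless", "notwithstanding"],
    ("even though", "formal"): ["despite that"],
    ("even though", "very formal"): ["in spite of the fact that"],
}


def glosses(base, register):
    """Return the English glosses for a basic disjunct in a given register.

    An empty list means the list above gives no gloss for that combination.
    """
    return REGISTER_GLOSSES.get((base, register), [])


if __name__ == "__main__":
    print(glosses("but/still", "very formal"))
    print(glosses("even though", "informal"))
```

The point of the sketch is only that one basic disjunct plus one register marker selects a whole family of English expressions, so none of them needs its own arbitrary root.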
(Actually, I doubt if it's possible to PRECISELY define the semantics of these register differences.)

23.4 COORDINATION AMBIGUITY

Conjunctions can be used to solve problems that sometimes show up if the syntax of your AL is strict and unambiguous. For example, if your syntax requires a relative clause to always attach to the closest preceding noun, you would not be able to render the following as a single sentence:

    I told him about the chicken that we had for supper that was killed by a coyote.

If the syntax of your AL is strict (as it is in the sample language), then the relative clause "that was killed by a coyote" would modify the noun "supper", which is nonsense. With a conjunction, however, the problem disappears:

    I told him about the chicken that we had for supper AND that was killed by a coyote.

Here, "AND" links the two "that" clauses so that both modify "chicken".

If a relative clause modifies a noun phrase that is part of a coordinated pair, the linkage may be ambiguous. Consider the following:

    1. The boy and (the girl who ran away)...
    2. (The boy and the girl) who ran away...

Personally, I would define the syntax of my AL so that relative clauses would modify only the single closest noun phrase by default, and so that conjunctions would link the following item with the closest preceding item of the same type. Thus, without further information, number 1 would be the only possible interpretation.

If you want the relative clause to apply to the compound phrase, you could modify the relative conjunction with a modifier meaning 'both' or 'all', or something similar. However, this is not a very good solution, since parsing success will now depend on the MEANING of the words in addition to the syntactic relationship between them. If you want your AL to be computer-tractable, parsing must depend ONLY on syntax.

Fortunately, there is a solution that is natural and easy to implement.
First, though, consider some English examples: The boy and the girl, both of whom ran away, ... Jim, Bob, and Joe, all three of whom were in the accident, ... In effect, the expressions "both of" and "all three of" terminate the coordinated structure and allow further modification. Thus, a simple, comprehensive, and versatile solution would be to create a single terminating particle that can undergo further modification. In the sample language, I will use the particle "saksika" for this purpose, with the general sense 'all (of whom)'. Here are some examples: The boy and the girl with the red balloons... -> Only the girl has the red balloons. The boy and the girl saksika with the red balloons... = The boy and the girl, both of whom have red balloons... It's important to keep in mind that ambiguities, such as in the first English example, are often language-dependent. In other words, if you translate a sentence from language X to language Y, it may be ambiguous in one language but not in the other. However, as a language designer, you have a choice - you can eliminate ambiguity by simply not allowing it to exist. And you can do that by providing only one possible interpretation for a particular structure. Thus, the first example above is ambiguous in English but NOT in the sample language. In the sample language, only the girl has the balloons. In order to indicate that both the boy and the girl have balloons, the particle "saksika" MUST be used. Now, consider the following two sentences, and note how the parentheses indicate how the constituents are grouped based on their most likely interpretations: (The boy with the red hat) and (the girl with the puppy)... The boy with ((the lunchpail) and (the book with the missing cover))... Syntactically, the two examples are identical, but a human listener would group the constituents differently. 
In the sample language, the adjectival phrase "with a missing cover" modifies the noun "book", and the conjunction "and" links the noun phrases "the lunchpail" and "the book with a missing cover". Thus, the grouping shown in the second example is correct, while the grouping shown in the first example is wrong. The reason why the first example is not ambiguous in English is because it's the only grouping that makes sense. However, it is possible for the same structure to be ambiguous, as in the following example: I just looked at the room with the new computer and the modem with the bad ICs. Is the modem in the same room as the computer? In the sample language, the answer is "yes", but in English the sentence is ambiguous. Does the computer also have bad ICs? In the sample language, only the modem has bad ICs, but in English it is not clear. In English, the sentence is doubly ambiguous, not only because attachment on the right is ambiguous, but also because we're not sure where the coordinated structure begins. Does it begin with "the room" or does it begin with "the new computer"? Now, the sample language is not ambiguous. Without "saksika", only the modem has bad ICs. With "saksika", both the computer and the modem have bad ICs. Also, in the sample language, there is no doubt that both the computer and modem are in the same room. How, though, can we indicate that they are NOT in the same room? Again, let's look at how English can do it: I just looked at both the room with the new computer and the modem with the bad ICs. Thus, we can achieve the desired effect by beginning the coordinated structure with the word "both". We can provide ourselves with the same capability in the sample language by creating a single particle that, when used, will mark the beginning of a coordinated structure. When it is not used, attachment will always be to the closest preceding constituent of the same type. In the sample language, we will use the particle "ceka" for this purpose. 
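The two default rules and the two particles can be summarized in a small sketch. This is not a parser for the sample language; it only encodes the decisions described above, and the function names and the list-based representation of noun phrases are assumptions made for illustration.

```python
def coordination_start(preceding_nps, ceka_at=None):
    """Return the index of the noun phrase where a coordinated structure begins.

    preceding_nps: noun phrases that come before the conjunction, in order.
    ceka_at: index of the noun phrase that "ceka" was placed in front of,
             or None if "ceka" was not used.
    Default rule: the conjunction links to the closest preceding noun phrase.
    """
    if not preceding_nps:
        raise ValueError("a conjunction needs at least one preceding noun phrase")
    if ceka_at is None:
        return len(preceding_nps) - 1
    return ceka_at


def modifier_targets(conjuncts, saksika=False):
    """Return the conjuncts that a trailing modifier applies to.

    Default rule: only the closest (last) conjunct is modified.
    With "saksika" before the modifier, every conjunct is modified.
    """
    return list(conjuncts) if saksika else [conjuncts[-1]]


if __name__ == "__main__":
    nps = ["the room", "the new computer"]
    # Without "ceka": "the modem" is coordinated with "the new computer".
    print(nps[coordination_start(nps)])
    # With "ceka" in front of "the room": coordination starts at "the room".
    print(nps[coordination_start(nps, ceka_at=0)])
    # "with the bad ICs" after "the new computer and the modem":
    print(modifier_targets(["the new computer", "the modem"]))
    print(modifier_targets(["the new computer", "the modem"], saksika=True))
```

Because both decisions depend only on the presence or absence of a particle, nothing in the sketch needs to know what the words mean.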
It's important to note that the coordination initiator "ceka" and the coordination terminator "saksika" will only be necessary when the default interpretation is not the desired one. And since most coordinated structures are relatively simple, these particles will probably not be needed very often. 23.5 PARENTHETICAL EXPRESSIONS AND QUOTING Parenthetical expressions which elaborate or exemplify a concept sometimes use conjunctions, but not always. Here are some examples in English: Some people, SUCH AS JOHN, BOB, AND MIKE, had to leave early. Many birds do not fly south for the winter (E.G. SPARROWS AND PIGEONS). The house needed certain repairs, such as TO THE ROOF AND TO THE CHIMNEY. The man who managed the finance department, BILL JOHNSON, also managed the marketing department. The single disadvantage (I.E. THE HIGHER COST) will probably kill the project. John Smith, WHO JUST FILED FOR BANKRUPTCY, recently moved to Texas. The actual form of such constructions will depend heavily on the syntax of the AL. I would suggest, though, that normal conjunctions be avoided. Instead, I would create pairs of start/stop particles (using macro forms). In the sample language, we will use the following: suka -> start particle for a parenthetical expression toka -> stop particle for a parenthetical expression The start particle will introduce a list of one or more items and the stop particle will terminate the list. [These particles would correspond to pauses used in speech, or commas and parentheses used in writing.] If a list has more than one item, then all the items must be constituents of the same type. For example, a list could contain several noun phrases OR several prepositional phrases, but a single list could not contain BOTH noun and prepositional phrases. Note that these particles can often be used in the same way that English uses quotes in writing or the words "quote" and "unquote" in speech. 
Here are some examples: The novel "Stranger in a Strange Land" was very good. I heard him speak the words "Rubba Dub Dub" at least three times. However, as currently defined, "suka" and "toka" do not delimit a complete, standalone argument - they simply exemplify an existing argument. Thus, in the above examples, "novel" and "word" are the actual arguments, and the items delimited by "suka" and "toka" only exemplify the actual arguments. In effect, the quoted material modifies the argument. There will be times, though, when we will need to create a standalone argument consisting only of quoted material. This material could even be in a different language. Here are some examples: "Stranger in a Strange Land" was very good. I heard him say "Rubba Dub Dub" at least three times. Note the difference with the earlier examples. In the earlier examples, the particles delimited something which modified an actual argument. But in the last two examples, the particles DEFINE an actual argument. Obviously, if the syntax of your language must be unambiguous and computer-tractable, you cannot use the same mechanism for both types of quoting. There are two possible solutions: 1. create a second set of particles or, 2. let "suka" and "toka" delimit a standalone argument and, when necessary, make IT the argument of what it is modifying or exemplifying. Creating a second set of particles is really overkill, so we will adopt the second approach. With this approach, we would make distinctions as follows: I heard him speak the ON-words suka Rubba Dub Dub toka at least three times. I heard him say suka Rubba Dub Dub toka at least three times. In the first sentence, "ON-words" is the open noun form of "words" (terminator = "-giu"), and the expression delimited by "suka" and "toka" is an argument of the open noun. In the second sentence, the expression delimited by "suka" and "toka" is a standalone argument of the verb "say". 
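To show how mechanical this delimiting is, here is a minimal sketch of a scanner that separates quoted material from the rest of a sentence. It assumes words are separated by spaces and that the first "toka" after a "suka" always closes the quotation; the function name split_quotes and the 'text'/'quote' labels are hypothetical, not part of the sample language.

```python
def split_quotes(words):
    """Split a list of words into ('text', [...]) and ('quote', [...]) segments.

    "suka" opens a quotation and the next "toka" closes it. Everything in
    between is kept verbatim and is never analyzed further.
    """
    segments = []
    current = []
    in_quote = False
    for word in words:
        if not in_quote and word == "suka":
            if current:
                segments.append(("text", current))
            current = []
            in_quote = True
        elif in_quote and word == "toka":
            segments.append(("quote", current))
            current = []
            in_quote = False
        else:
            current.append(word)
    if in_quote:
        raise ValueError('quotation opened with "suka" but never closed with "toka"')
    if current:
        segments.append(("text", current))
    return segments


if __name__ == "__main__":
    sentence = "I heard him say suka Rubba Dub Dub toka at least three times"
    for kind, segment in split_quotes(sentence.split()):
        print(kind, " ".join(segment))
```

Whether a quoted segment is a standalone argument or the argument of an open noun is then decided by what precedes "suka", not by anything inside the quotation.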
Finally, there is no reason why quoted material must itself be parseable. In fact, it could even be in a different language. The only requirement, of course, is that the terminating particle "toka" must not appear inside the quoted material. In the extremely rare case where "toka" itself needs to be quoted, some form of periphrasis should be used instead.

23.6 INCOMPLETE COORDINATION

Sometimes we need to coordinate more than one item, but do not want to list all items that are applicable. In English, we use expressions such as "etcetera" and "and so on" to do this. In the sample language, we will use the particle "mika" for this purpose. In effect, it terminates a list of coordinated items, and indicates that the list is incomplete. It may also be used to terminate a list of items in a parenthetical expression in place of the stop particle, if the list is incomplete.

23.7 ANAPHORA OF COORDINATED STRUCTURES

There will be times when an anaphor of a coordinated structure will be needed. Here are a few examples:

    a. The engineer and his assistant just left. THEY had to go to work.
    b. The windows broke and a wall fell in. IT was a terrible experience.

In the sample language, a simple anaphor of a coordinated structure will be formed from the first morpheme of the first conjunction, plus the appropriate terminator. A compound anaphor will be formed from the first morpheme of the conjunction, plus the first morpheme of one of the coordinated items (preferably the first one), plus the appropriate terminator. For example, if the word for 'and' is "neka", then the simple anaphor meaning "THEY" in (a) will be "neha". If the word for "engineer" is "veyaneyada", then the compound anaphor for "THEY" in (a) will be "neveha".

23.8 CONDITIONAL CLAUSES

When one event is conditional upon another, English normally links the events with an "if...then" construction, as in the following example:

    If the law is passed, (then) tax forms will be simpler.
And, as we discussed earlier, conditional clauses can be implemented using conjunctions derived from the contingency/entailment relationship or by directly using the contingency/entailment verbs.

There are times, though, when we would like to mark a clause as conditional without being forced to create an "if...then" structure. Consider the following two sentences:

    Tax forms WILL be simpler under the new law.
    Tax forms WOULD be simpler under the new law.

The first implies that the law is or will be in force. The second implies that the law MAY be in force; i.e., that passage of the law is still hypothetical.

So, how do we implement the concept represented by the English word "would"? It is clearly not aspectual or modal (at least not within the framework of this monograph). Actually, in light of our discussion in the previous section, the solution should be fairly obvious. The word with the meaning of the English word "would" is simply the disjunct form of the contingency relationship. For example, if the root for the contingency relationship is "-pau-" (default class = P/F-s), then we can do the following (the actual forms of the items in parentheses will depend on the syntax of the AL):

    Verb P/F-s:
        (Tax forms are simpler) pausi (the new law passes)
        = Having simpler tax forms is contingent upon/depends on passage of the new law.

    Inverse F/P-s:
        (The new law passes) pauvisi (tax forms are simpler)
        = Passage of the new law entails/means simpler tax forms.

    Conjunction "pauka" = 'then':
        (The new law passes) pauka (tax forms will be simpler)
        = If the new law is passed, then tax forms will be simpler.
        [Note that the word meaning "if" cannot be used here.]

    Conjunction "pauvika" = 'if':
        (Tax forms will be simpler) pauvika (the new law passes)
        = Tax forms will be simpler if the new law passes.

    Disjunct "pauxisi" or "paufo":
        Pauxisi (tax forms are simpler with the new law)
        = Tax forms would be simpler with the new law.
Pauxisi (tax forms were simpler with the new law) = Tax forms would have been simpler with the new law. Since "pauxisi/paufo" simply marks its argument as 'hypothetical', it can also be used to represent English "if = whether" in subordinate clauses and complements, as in the following example: I don't know paufo John will be coming = I don't know if/whether John will be coming. Note that "pauxisi/paufo" simply indicates that its argument is conditional on some unspecified condition, and that the argument is hypothetical. Thus, it makes no sense to apply a tense to "pauxisi/paufo" itself. The tense must be supplied by the argument of "pauxisi/paufo". In this respect, "pauxisi/paufo" itself behaves just like a modal or tense/aspect word. 24.0 COMPOUNDING AND INCORPORATION Compounds are single words or simple expressions that represent unique concepts, but which are formed by combining two or more root morphemes. There are three kinds of compounds: 1. Compounds which represent the sum of their components (i.e., both components are present): to test-fly = to test AND to fly also drop-kick, stir-fry 2. Compounds in which one root is the argument (core or oblique) of the other root: watchmaker = X makes watch (argument = object) also mousetrap, fly swatter, housecleaning, blood test(er) Compounds of this type can also be created using verbs that are derived from basic nouns: baby oil (= X 'oils' baby), dish towel (= X 'towels' dish), doghouse (= X 'houses' dog), towel rack (= X 'racks' towel), dancehall (= X 'halls' dance), water skis (= X 'skis' water), snowshoes (= X 'shoes' snow), etc. rescue team = team rescues Y (argument = subject) also team rescue, student association, fan club, manmade [Note that the grammatical voice of the verb meaning 'rescue' determines whether the interpretation is 'rescue team' or 'team rescue'.] 
    college education = X educates Y in/at college (argument = oblique locative)
        also beach party, mountain warfare, barn dance, city life

    spring showers = it rains DURING spring
        also battle fatigue, evening prayers, marital sex, night flight

    to towel dry = X dries Y using towel (argument = oblique instrument)
        also steam iron, to water cool, handwriting, windmill

    to backpedal = X pedals backwards (argument = oblique method/manner)
        also to sidestep, freestanding, to dog-paddle, to bunny-hop

And so on. Many more oblique relationships are possible.

3. Compounds in which BOTH roots are core arguments of an IMPLIED verb:

    bedsore = bed CAUSES sore
        also disease germ, storm damage, tear gas, birth pain
    [Note that the INVERSE sense of the verb "cause" is used for "disease germ" and "tear gas".]

    tax laws = laws BEING_FOCUSED_ON taxes
        also murder investigation, UFO sighting, food requirements

    houseboat = boat BEING_THE_SAME_AS house
        also dungheap, girl friend, infantry battalion, snowball
    [Note that this group could also be considered as the noun equivalent to verb compounds like "stir-fry" mentioned above, since both components are present.]

    penknife = knife BEING_SIMILAR_TO/RESEMBLING pen
        also handlebar mustache, birdbrain, mother church, hamhanded

    mountain village = village BEING_LOCATED_IN/AT mountains
        also pocket watch, doorstep, hill people, farm house, bedpan
        also inverses silver mine, goldfish pond, racetrack

    olive oil = oil BEING_A_DERIVATIVE_OF olives
        also solar energy, buffalo hide, wood pulp, cane sugar
        also inverses meat calf, milk cow, sugar cane, pulp wood

    toolbox = box HAVING tools
        also apple pie, bedroom, art museum, pea pod, salt marsh
        also inverses lemon peel, student power, door knob, windowpane

And so on. There may be others that fall into this category. However, if there are, I doubt there are very many of them.

Note that many compounds can appear in more than one category.
For example, "tree nursery" can be derived from "X GROWS trees AT nursery" or the inverse of "trees BEING_LOCATED_IN nursery". The compound "towel rack" can be derived from "X places towel ON rack" or the inverse of "towel BEING_LOCATED_ON rack". It is important to keep this in mind, since it's possible that one version may be implemented more efficiently than another, even though they have essentially the same meanings. Also, some are more specific, and thus less useful, than others. [Incidentally, Mandarin Chinese has many compounds in which each component means essentially the SAME THING. However, since most Chinese morphemes have several meanings, using just one would be ambiguous. By using two with the same or close meanings, the result is a word whose meaning is the meaning that the two components have in common. In a properly designed AL, this type of compound is totally unnecessary.] 24.1 IMPLEMENTING COMPOUNDS Some languages implement compounds by simply juxtaposing complete words (e.g. English, Chinese, Indonesian, and Quechua). Unfortunately, this approach is useless if you want the resulting compounds to be semantically precise. (By "precise" I mean 'as precise as the inherent precision of the basic components will allow'.) For example, what is the relationship between "house" and "boat" in the word "houseboat"? What is the relationship between "house" and "maid" in the word "housemaid"? Obviously, the relationships are different. Another way to implement compounds is to use a combination of a head word and a morphologically correct modifier (e.g. English adjective-noun compounds "solar panel", "marital sex", "marine life", "academic transfer", etc.). English uses this approach occasionally, French uses it more often, while Russian and Arabic use it quite often. In general, a language is more likely to use this approach if it has a regular and productive way to convert words from one part-of-speech to another. 
However, while the semantics of this kind of construction is more precise than simple juxtaposition, it can still be ambiguous. In many languages, ambiguity is somewhat reduced by using linking morphemes such as English prepositions. Swahili uses this approach for all of its compounds, and French uses it for most (French examples: "salle à manger", "eau de toilette", "film en couleurs", etc.). English uses it occasionally, as in "son-in-law", "hand-to-hand", and "bed of nails". Note, though, that these linking words can be very vague and their use is often idiosyncratic. If we want the semantics of our compounds to be precise, then the semantics of the linkers must also be precise. With the above comments in mind, let's look again at each type of compound and ask ourselves the following questions: a. Do we already have a way to implement this type of compound? b. If not, what new technique should we create to do it? As I will show below, the answer to question "a" is always "yes", making question "b" unnecessary. Here goes... 1. Verb-verb compounds Compounds similar to English "stir-fry" seem to be quite rare among natural languages. The only languages I know of that use them frequently are Chinese and a few others that make extensive use of serial verb constructions. My recommendation is to implement verb-verb compounds ONLY if the syntax of your AL can unambiguously handle serial verb constructions. If not, then your AL should require the use of an explicit conjunction, as in "to stir AND fry" or "to drop THEN kick". In our sample language we have both options: we can use conjunctions, or we can create case tags and adverbs that perform the same semantic function as serial verbs. 2. Compounds in which one root is the argument of the other root We can often accomplish this in our sample language by 'opening up' the argument structure of nouns and adjectives derived from verbs. 
Here are two examples using words we've already created:

    duck student = "teyomigiu guasuda" = 'studier of ducks'
        where noun "teyomida" = 'student', and "guasuda" = 'duck'.
    [Remember, we open up the argument structure of a normally 'closed' noun by terminating the word with "-giu" instead of "-da".]

    duck student = "guasuno teyomida" = 'a student who is a duck'
    [Here, there is no need to 'open up' the noun "student" to make the subject position available for use. Instead, we simply use the adjective version of the noun meaning 'duck'.]

For compounds in which one component is an OBLIQUE argument of the other, there is more than one possible approach. For example, a compound like "mountain village" can be implemented as "village mepe mountains", which literally means 'village in the mountains'. (Incidentally, this is the way that most natural languages would form such a compound.) Another way is to use a verb form of one of the components. Here is a complete derivation:

    -ke-         state root meaning 'high'
    -nai-        noun classifier for natural location
    kenaida      'mountain'

    -xoya-       state root meaning 'alive/living'
    -te-         noun classifier for artificial location
    xoyateda     'town'
    -so-         "smaller" diminutive CCM
    xoyatesoda   'village'

    A/P-d: kenaipusi = to 'mountain' something; i.e. to cause something to come together with 'mountain'
    P-s:   kenaisesi = to be 'mountained'

    Thus, "kenaiseno xoyatesoda" = 'mountain village'

All locative noun-noun compounds, including inverses such as "silver mine", can be implemented in this way. We can also create a vaguer compound by using the open noun version of the word for 'village'. The result is "xoyatesogiu kenaida", which can be paraphrased 'village of mountain'.

Finally, many compounds are really not necessary. For example, the English word "backpedal" can be just as easily implemented as "to pedal backwards", where "backwards" is a basic adverb.

3.
Compounds in which BOTH roots are core arguments of an IMPLIED verb Again, our sample language already has this capability. Here are a few examples using words we've already created: student teacher = "teyomino teyokoda" = 'teacher who is a student' where noun "teyokoda" = 'teacher', and noun "teyomida" = 'student'. duck reservoir = "guasuseno guateda" where P-s verb "guasusesi" = 'to be populated by ducks', and "guateda" = 'reservoir'. snow duck = "xumpifano guasuda" = 'duck which is snow' (cf. "snowman", "snowball", etc.) where "xumpifada" = mass noun 'snow', and "guasuda" = 'duck'. Note that most of the above are just adjective-noun compounds, where the basic relationship is not stated separately, but is the result of normal derivational rules. Our sample language can create many compounds this way, as is commonly done in languages such as French, Russian, and Arabic, but with true semantic precision. Now, let's create some compounds in which the relationship must be indicated by a separate word: hydrology book = book mabie guatiwada = 'book focused_on hydrology' where "guatiwada" = 'hydrology', "mabie" = generic P/F-s open adjective. [Note that we could also have used the open form of the P/F-s derivation of the word for 'book', with "guatiwada" as its argument.] penknife = knife lobie pen = 'knife being_similar_to pen' where "losi = lomasi" is the P/F-s verb 'to be similar to' [This compound can also be implemented as "knife-like pen", where "-like" is the root/MCM "-lo-".] silver mine = mine mevibie silver = 'mine being_the_location_of silver' where "mevibie" is derived from the inverse of the P/F-s verb meaning 'to be located in/at'. [Incidentally, "silver mine" can be rendered more efficiently using the same approaches that we used above for "mountain village"; i.e. "silvered mine" or "mine of silver". It can also be implemented as "mine containing silver", where "containing" is derived from the basic constituency relationship we discussed earlier. 
In fact, the MOST efficient as well as most general derivation would be the noun for 'mine' modified by the simple adjective version of 'silver'. This construction states that the 'mine' IS silver. However, there is no reason to insist that the 'mine' must consist ENTIRELY of silver in order to use this construction.] And so on. These compounds are similar to Swahili compounds and most French compounds, but are semantically precise. English often creates similar constructions, such as "blood-sucking mosquitos", "swamp-dwelling amphibians", "man-eating tigers", "house-cleaning lady", etc. In these, however, only the hyphenated part of the construction is usually classified as a compound. Thus, since ANY relationship can be expressed by a transitive verb, and since ANY transitive verb can be converted to an open adjective, there is no limit on the number of compounds that can be created with semantic precision. Finally, the approach we are using here allows us to create many useful compounds that, in a language like English, would be either ambiguous or even impossible to create. For example, the English compound "woman teacher" could mean 'woman who teaches', 'teacher of woman', 'teacher who focuses on women', 'one who teaches like a woman', etc. With the system proposed here, we can create more compounds, and their meanings are always obvious. This ability is especially important if you are creating an AL that will be used by people who have different native languages. For example, if we were to create compounds as in English (by the simple juxtaposition of two root morphemes) the results will often be interpreted differently by people of different linguistic backgrounds. Also, it's important to note that the methods we are using here were not developed for the purpose of creating compounds. In fact, we did not develop ANY special techniques in this section on compounding. 
All of the techniques that we used to create compounds are the basic derivational processes we've been using all along. This is highly beneficial because it FORCES the word designer to create and use root morphemes systematically, rather than idiosyncratically. For example, if the word designer were to borrow the English compounding system, then he would almost certainly want to borrow many of the English compounds themselves. The net result would be a clone of the English vocabulary, and speakers of other languages would often be puzzled, confused, and even frustrated by the choices made. However, by making full use of a rich derivational morphology, we can completely eliminate idiosyncrasy, and rules intended exclusively for compounding are simply not necessary. It will also make syntactic parsing MUCH simpler, and, if the syntax is designed properly, syntactic ambiguity can also be completely eliminated.

24.2 COMPOUNDS FROM OTHER DERIVATIONAL MORPHEMES

As we've already seen, derivation of basic nouns is actually very similar to compounding. For example, when we derived nouns from the root meaning 'water', we paraphrased them using expressions such as "water bug" and "water energy". In fact, because of the nature of the classificational system we are using here, all basic nouns are pseudo-compounds in which the classifier plays the role of a semantically precise 'component' or 'headword'.

To make this approach even more flexible, we can extend it by using both verb AND noun classifiers in the same word. For example, we can combine the verb "teyomisi" meaning 'to study' with the 'time' noun classifier "-be-" to create the word "teyomibeda" meaning 'study period'.

Since natural languages contain many more nouns than verbs, we may want to increase the productivity of our morphology by creating even more noun classes. For example, we could add several new animal categories. We could split the plant categories into more precise sub-categories.
We could divide up the various artifact categories based on how the objects are created or used, such as 'vehicles', 'tools', 'works of art', etc.

CCMs and MCMs can also be used to create words that, in other languages, are often implemented as compounds. We've already seen many of these. Here are some examples:

    xumpifagida      -> snowflake (count CCM "-gi-")
    xumpijigeda      -> snowstorm ("high" scalar MCM "-ge-")
    xumpijisoda      -> snow flurries ("low" scalar MCM "-so-")
    teyomidejazmida  -> subject matter (middle CCM "-de-", and mass CCM "-jazmi-")
    teyokoveda       -> teaching ability (quality/ability CCM "-ve-")
    tencipiapada     -> intellectual growth (process CCM "-pa-")
    mesuabosi        -> to get together (reciprocal CCM "-bo-")

In fact, it would be very useful to create several more classifiers that represent concepts that are commonly used in compounding. We could call these morphemes _compounding classifiers_. Here are some concepts that we might want to implement as compounding classifiers, along with the actual morphemes that we will use in the sample language:

    room                          -kai-    e.g. bedroom, kitchen, concert hall
    building/residence            -moi-    e.g. doghouse, boathouse, museum, temple
    shop/business                 -xempi-  e.g. butcher shop, bakery, bookstore
    measurement/detection device  -mesko-  e.g. thermometer, scale, microphone
    tool/implement                -ca-     e.g. screwdriver, hammer, fork, scissors
    vehicle                       -gau-    e.g. bicycle, automobile, rickshaw, boat

As an example, when we discussed how to change basic verbs into nouns and adjectives, we briefly mentioned that verbs with "instrumental" subjects could be quite useful.
Now that we have the tool/implement classifier, we can elaborate: from A/P/F-d "teyokosi" = 'to teach', we can derive: "teyokocada" = teaching materials "teyokocano" = pedagogical/educational from A/P/F-p "teyoniosi" = 'to instruct', we can derive: "teyoniocano" = instructive from AP/F-s "teyofisi" = 'to review' "teyoficada" = review materials from AP/F-d "teyosuasi" = 'to self-teach' "teyosuacada" = self-study materials from AP/F-p "teyomisi" = 'to study' "teyomicada" = study materials "teyomicano" = heuristic Words meaning 'room', 'building', 'place of business', etc. can be created by simply using the classifiers as roots. For example, the word meaning 'room' is "kaida" and the word meaning 'vehicle' is "gauda". In summary, I don't feel that it's necessary to add any new morpho-lexical features to the sample language to handle compounds. Any compound that is needed can be created easily and precisely using the existing derivational techniques. 24.3 MNEMONIC DERIVATIONS Some compounds are not semantically precise, but actually refer to a subset of entities within a class. In other words, the compound actually describes more entities than it is intended to represent. For example, we might be tempted to create the adjective+noun compound with the literal meaning 'black bear' to represent the species 'Black bear'. However, this would be incorrect, since 'black bear' can apply to any bears that are black in color, even those that are not members of the species 'Black bear'. Because of this, a normal compound cannot be used. What we need is a way to make a distinction between normal, semantically precise compounds and mnemonic compounds. In the sample language, we will accomplish this by changing the noun terminator from "-da" to "-dawe", the adjective terminator from "-no" to "-nowe", the verb terminator from "-si" to "-siwe", and the adverb terminator from "-pe" to "-pewe" for derivations that refer to distinct concepts that are over-described by normal derivation. 
The new terminator will be used on the headword of a normally formed compound. For example, if the word for 'bear' is "hayumoda", and the root for "black" is "-xava-" (default class = P-s), then the words "xavano hayumoda" can be applied to any bear that is black in color, while the mnemonic compound "xavano hayumodawe" will refer only to members of the species 'Black bear'. With this approach, we are providing ourselves with the ability to use normal compounding techniques where we feel that a simple basic noun is inappropriate. [Note that mnemonic derivations do not have to be limited to compounds - they can also be used for single words - although I can't think of any examples at the moment. Also, keep in mind that a mnemonic compound such as "xavano hayumodawe" is only likely to be used once. After it is introduced, the compound anaphor "hayuxaha" can be used for the remainder of the discourse.] Later, I will discuss a consistent and objective approach for naming species that uses both normal basic nouns and mnemonic compounds. 25.0 TOPICALIZATION We've already discussed two ways in which an argument of a verb can be 'topicalized' or made more salient than other arguments. In this section, I will discuss and summarize all of the various degrees of topicalization that an AL will need. Topical constructions add emphasis and sometimes contrast over and above the normal topicalization indicated by argument structure. In natural language, there are basically four degrees of topicalization: 1. Normal topicalization. Topicalization is indicated by the basic argument structure of the verb; i.e. a subject is more topical than an object or an oblique argument. In some languages, especially those with an anti-passive construction, objects are more topical than oblique arguments. (English does not seem to make a distinction in topicality between objects and obliques. 
This view is supported by the fact that so many English verbs are inherently anti-passive but do not have active counterparts with clear differences in topicality; e.g. "to listen to", "to talk to", "to look at/for/up", "to wink/shout/laugh at", "to complain", etc.)

2. Contrasting topicalization. Topicalization provides both emphasis and contrast. Here are some English examples:

    It's John who killed the chicken.
    It's the chicken that John killed.
    It's killing that John did to the chicken.
    A chicken is what John killed OR What John killed is a chicken.

3. Heavy topicalization. An argument of the verb is made more topical than the subject. Here are some English examples:

    Bill, I saw him yesterday.
    The new amusement park, it opens for business today.
    On Sunday, I plan to relax all day.
    With his new suit, he can attend the conference without embarrassment.

4. Reference-switching. A new entity is introduced into the conversation and singled out for special attention. Here are some examples:

    As for the chair, John broke it.
    As regards John, he left in disgust.
    As far as the meeting is concerned, I decided not to attend.
    With regard to the delays, I assure you they won't happen again.

Normal topicalization is an inherent part of the verbal derivational system that we are discussing in this monograph. This system is not only perfectly regular, but it allows us to create four sub-degrees of topicality (subject vs. object vs. expressible oblique vs. inexpressible oblique). In contrast, most European languages provide only two or three sub-degrees, while typically displaying a considerable amount of idiosyncrasy.

The second kind of topicalization is used to add both emphasis and contrast to an argument of a verb. English is somewhat unusual among the world's languages in implementing this function using cleft sentences. Most languages achieve this function by somehow marking the item with an inflection or particle and leaving the item in its normal position in the sentence.
In our sample language, this emphasis/contrast function is performed in the more typical way; i.e. by using modal particles (or their derivatives), as we discussed earlier. The third type of topicalization, heavy topicalization, focuses the listener's attention on a particular argument of the verb. In effect, it makes the argument even more topical than a normal subject. Most natural languages, including English, accomplish heavy topicalization by a process called _left dislocation_; i.e. by moving the emphasized argument out of the sentence and placing it before the sentence. In addition, an anaphor of the moved item normally appears in the original position in the sentence if the moved item is a core argument of the verb. Thus, in English: The Smiths, THEY left early. Here, "the Smiths" is left-dislocated and the anaphor "they" takes its place in the sentence. In addition to the dislocation, languages mark the emphasized item either by an explicit marker, such as a particle, by a change in stress and timing, or both. Left-dislocation seems to be the way that most natural languages implement heavy topicalization. In fact, I have not been able to find a single example of a language that implements this function differently. Also, in most (if not all) languages, an anaphor of the dislocated item occupies the original position in the sentence if the dislocated item is a core argument. Thus, I suggest that the same approach be used in an AL. In our sample language, I will reserve the particle "pika" for this purpose. Here are some examples: Pika guasuda, the sailors ate guaha. = The duck, the sailors ate it. Pika on Sunday, I plan to relax all day. = On Sunday, I plan to relax all day. And so on. Note that we could have used the deictic "sestuda" (meaning 'it' or 'they') instead of the anaphor "guaha". However, as we discussed earlier, the use of deictics can result in ambiguity if the referent of the deictic is not obvious from context. 
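For readers who find it easier to think in code, the left-dislocation rule for "pika" can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name, the representation of a clause as a list of constituents, and the "is_core" flag are hypothetical and are not part of the sample language. The sketch captures one rule from the discussion above: a core argument leaves an anaphor behind in its original position, while an oblique argument leaves nothing.

```python
def heavy_topicalize(clause, index, anaphor=None, is_core=True, particle="Pika"):
    """Left-dislocate clause[index] and mark it with a particle.

    clause   -- list of constituents in normal order (hypothetical format)
    index    -- position of the constituent to be dislocated
    anaphor  -- word left behind when the moved item is a core argument
    is_core  -- True for subject/object, False for oblique arguments
    """
    topic = clause[index]
    if is_core:
        if anaphor is None:
            raise ValueError("a core argument must leave an anaphor behind")
        # The anaphor fills the gap left by the moved core argument.
        remainder = clause[:index] + [anaphor] + clause[index + 1:]
    else:
        # An oblique argument is simply removed from the clause.
        remainder = clause[:index] + clause[index + 1:]
    return f"{particle} {topic}, {' '.join(remainder)}."


# The two examples from the text:
print(heavy_topicalize(["the sailors", "ate", "guasuda"], 2, anaphor="guaha"))
print(heavy_topicalize(["I", "plan to relax all day", "on Sunday"], 2, is_core=False))
```

The particle is a parameter only so that the same sketch could serve other dislocating particles; nothing about the real syntax of the AL is implied by the English word order used here.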
The fourth kind of topicalization, reference-switching, introduces or reintroduces an entity into the conversation, and singles it out for special attention. This is also normally implemented as a type of left-dislocation, since the argument is moved to the left of the sentence and the gap in the main sentence is almost always filled with an anaphor of the moved argument. In English, this is usually accomplished with phrases such as "as for", "with regard to", "as far as X is concerned", etc. In our sample language, I will use the particle "boka" for this purpose:

    Boka John, I think the boss is going to fire him.
    = As far as John is concerned, I think the boss is going to fire him.

    Boka the new employee, I think he'll do very well.
    = As for the new employee, I think he'll do very well.

Note that both "pika" and "boka" require that a complete sentence immediately follow their argument.

26.0 PROPER NOUNS AND VOCATIVES

Proper nouns are the names of individual people, places, and things. However, what is considered a proper noun can differ from language to language. Here is the precise definition that we will use for the sample language:

    A proper noun is a word that is used to refer to one or more specific, unique representatives of a more general category designated by a basic noun.

Thus, using the above definition, words such as "Atlantic", "Johnson", "IBM", "Christianity", "Caucasian", "1996", and "USA" are all proper nouns. They are unique instances of, respectively, the following common nouns: "ocean", "person", "corporation", "religion", "race", "year", and "nation".

There's more than one way to deal with proper nouns, but I feel that the best approach is to reserve several terminators for proper nouns and their verb, adjective, and adverb forms. Roots, classifiers, CCMs, and MCMs will NOT have semantic significance, but they may be used for their mnemonic value.
In addition, consonants and vowels that are not part of the sample language may be used in proper nouns, and the rules limiting the forms of vowel and consonant clusters can be ignored. Here are the terminators we will use in the sample language:

    proper noun       -daya
    proper adjective  -noya
    proper verb       -sia
    proper adverb     -peya

Here are some examples:

    Boston            - Bostodaya
    New York          - Nuyokedaya
        [Note that we cannot use "Nuyokadaya" because "ka" is a terminator.]
    John/Jonathan     - Jonadaya
    Johnny            - Jonacaudaya ("-cau-" is the 'informal' MCM)
    Richard           - Ricadaya
    Louise            - Luisadaya
    Michael           - Mikodaya
    Anderson          - Nandersodaya
        [In the sample language, a word cannot begin with a vowel.]
    Franklin Delano Roosevelt - Franklinoya Delanoya Rosaveltadaya
        [Note that we cannot use "Delanonoya" because "no" is a terminator.]
    Boeing            - Bowingadaya
    Democratic (party) - Demokratinoya
    France            - Fransadaya
    Japan             - Nipodaya
    Detroit           - Detrodaya

Obviously, the above approach does not allow further derivation. However, we can make one exception to our normal rules as follows:

    The non-noun forms of all proper nouns will be genitive.

This should not cause any problems since there is rarely a need to make a distinction between alienable and inalienable possession using proper nouns. (Note that this is also the approach we took with anaphora.) Thus, the word "Niponoya" would mean 'Japanese', since it refers to anything that is associated with Japan. Similarly, "Niposia" means 'to be Japanese' and "Bostosia" means 'to be Bostonian'. The adverbial form will be class "0" by default, and will be especially useful with temporal and locative expressions (more below).

A proper noun can be modified by adjectives to indicate titles. Here's an example:

    teyokoda                  - 'teacher'
    teyokogeda                - 'professor' ("-ge-" = augmentative MCM)
    Nandersodaya              - 'Anderson'
    teyokogeno Nandersodaya   - 'Professor Anderson'

Note that the above literally means 'Anderson who is a professor' or simply 'Anderson the professor'.
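The terminator scheme for proper nouns is mechanical enough to sketch in code. The following Python fragment is a minimal illustration, not a definitive implementation. The four terminators come from the list above; the idea of a bare "stem" (e.g. "Nipo"), the function name, and the partial list of reserved endings are my own assumptions. The two checks reflect one reading of the bracketed notes above: a word cannot begin with a vowel, and a name is rejected if what precedes the proper terminator already ends in an ordinary terminator such as "ka" or "no".

```python
# Proper-noun terminators, as listed in the text.
PROPER_TERMINATORS = {
    "noun": "daya",
    "adjective": "noya",
    "verb": "sia",
    "adverb": "peya",
}

# ASSUMPTION: a partial list of ordinary terminators, gathered from the
# examples in this monograph; the full inventory may be larger.
RESERVED_ENDINGS = ("da", "no", "si", "pe", "ka")

VOWELS = "aeiou"


def proper_form(stem, part_of_speech="noun"):
    """Attach a proper terminator to a (hypothetical) bare name stem."""
    if not stem:
        raise ValueError("empty stem")
    if stem[0].lower() in VOWELS:
        # "In the sample language, a word cannot begin with a vowel."
        raise ValueError(f"{stem!r} begins with a vowel")
    if stem.lower().endswith(RESERVED_ENDINGS):
        # e.g. "Nuyoka" + "daya" is ruled out because "ka" is a terminator.
        raise ValueError(f"{stem!r} ends in a reserved terminator")
    return stem + PROPER_TERMINATORS[part_of_speech]


print(proper_form("Bosto"))              # noun form
print(proper_form("Nipo", "adjective"))  # genitive adjective form
print(proper_form("Nipo", "verb"))
```

With these stems the calls reproduce "Bostodaya", "Niponoya", and "Niposia" from the text, while stems such as "Nuyoka", "Delano", and "Anderso" are rejected.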
In the same way, the adjective form of a modifying noun can be used to create a proper name with a more specific meaning. For example: kenaida - 'mountain' Verestedaya - 'Everest' kenaino Verestedaya - 'Mount Everest' Note that the above example literally means 'Everest the mountain'. To signify a looser connection between the common and proper nouns, we could use the open noun form of the common noun instead of the adjective form: kenaigiu Verestedaya - 'Mount Everest' Literally, the above example means 'the mountain of Everest' or 'the mountain associated with Everest'. However, I am not entirely happy with the second approach. It seems to me that the headword of a proper compound should itself be a proper noun. An expression such as 'the mountain of Everest' is not a true proper compound because it could conceivably apply to more than one mountain. Note that either approach can be used to name items such as "The Eiffel Tower", "The Sea of Japan", "Ockham's Razor", etc. They can also be used for proper compounds in which neither component is a proper noun, such as "The White House", "The Grand Prix", and "The Liberty Bell". For example, "The Liberty Bell" could be implemented as "Liberty the bell", where "bell" is simply the adjective form of the common noun meaning 'bell', and "Liberty" is the noun meaning 'liberty' terminated by "-daya" instead of "-da". Conventions can also be adopted that apply to proper nouns that come in groups. For example, days of the week can all have the form "DeXXXdaya", where the sub-string XXX is a numeric morpheme: Defedaya - Monday ("-fe-" = numeric 'one') Dedudaya - Tuesday ("-du-" = numeric 'two') Dezidaya - Wednesday ("-zi-" = numeric 'three') And so on. Conventions can also be adopted for months of the year, the years themselves (e.g. "1996"), letters of the alphabet, stellar constellations, etc. The adverbial forms will be most useful; e.g. "Dezipeya" = '(on) Wednesday', "Bostopeya" = 'in Boston', etc. 
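The "DeXXXdaya" convention for days of the week can be sketched as a small lookup. This Python fragment is only an illustration: the function name and the numbering of days from Monday = 1 follow the three examples above, and only the three numeric morphemes actually given in the text are included. Any other day number is rejected rather than guessed.

```python
# Numeric morphemes given in the text; the others are not listed here,
# so they are deliberately left out rather than invented.
NUMERIC_MORPHEMES = {1: "fe", 2: "du", 3: "zi"}


def weekday_name(day_number, adverbial=False):
    """Build a weekday name of the form De + numeric morpheme + terminator.

    day_number -- 1 for Monday, 2 for Tuesday, 3 for Wednesday
    adverbial  -- if True, use the proper adverb terminator "-peya"
                  instead of the proper noun terminator "-daya"
    """
    if day_number not in NUMERIC_MORPHEMES:
        raise ValueError(f"no numeric morpheme known for {day_number}")
    terminator = "peya" if adverbial else "daya"
    return "De" + NUMERIC_MORPHEMES[day_number] + terminator


print(weekday_name(1))                  # Monday
print(weekday_name(3, adverbial=True))  # (on) Wednesday
```

The same pattern would presumably extend to months or years once the relevant numeric morphemes are defined, but that is an assumption on my part.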
A special part-of-speech terminator can be used for vocatives. In the sample language, I will use "-vu" for common vocatives and "-vuya" for proper vocatives:

    Ricavuya, come here!
    = Richard, come here!

    Teyokogevu, may I have a word with you?
    = Professor, may I have a word with you?

    Teyokogeno Nandersovuya, I won't be able to attend the seminar.
    = Professor Anderson, I won't be able to attend the seminar.

    Vu, where are you going?
    = Say there, where are you going?

Note that a vocative is like a noun because it can be modified by an adjective, an open adjective, or even a relative clause. It functions, however, as a complete, stand-alone sentence.

27.0 CHOOSING PRIMITIVES: VOCABULARY DESIGN STRATEGY

In this section, I would like to discuss a strategy that you can use to design the vocabulary of an AL. This strategy will not contain specific rules or procedures, but instead, is aimed at providing an overall philosophy or set of guidelines. (I will discuss a more specific methodology in the next section.)

27.1 DESIGNING BASIC VERBS

Very early in this monograph, we decomposed the verb meaning 'to know' into a root concept and an argument structure. We then applied all other possible argument structures to the same root. This process resulted in many unexpected and extremely useful derivations. The number of useful derivations increased even more as we applied CCMs and MCMs. What I find most gratifying about the process is that many of the derivations are truly surprising. For example, while I felt that the concepts underlying the words "forgive" and "apologize" were related, I would not have expected to be able to derive both words from the same root.

With the above in mind, I have defined five principles of verb design:

1. Start with simple, common verbs. Isolate their root concepts and apply ALL classifiers. Appropriate CCMs should be used when related verbs have different argument structures (e.g. "to say" vs. "to tell"). In the process, the vast majority of less common verbs will be automatically derived. This principle also applies to numeric, deictic, tense/aspect, and modal concepts.

2. Keep in mind the inherent difference between basic state concepts and modal concepts. Always test new concepts to determine if they are modal.

3. Postpone derivation of actions until all state verbs have been created. Many action verbs are actually "-p" derivations of state concepts.

4. If you have difficulty defining a basic state or modality, or if it has limited usefulness when combined with most classifiers, CCMs, and MCMs, it is very likely that the state is not very basic.

5. Always be suspicious of roots that represent energetic states. Many of these concepts can actually be derived from non-energetic states that end up being much more productive.

The fourth principle is the most difficult to apply, since the nature of the more basic state or modality may not be obvious. In a situation like this, postpone derivation of the particular verb or modal. There's a good chance that the desired word will be derivable from a different root concept that you haven't yet defined. Another tactic is to examine words that have similar meanings (a thesaurus can be very useful for this), or to create a few paraphrases of a sentence that uses the word. For example, how do we deal with the verb "to establish"?

    He established his innocence.
    He proved his innocence.
    He convinced others of his innocence.

    "He" = agent
    "others" = patient
    "his innocence" = focus

Thus, "to establish/prove" is simply the A/F-d [+P] (i.e. anti-passive) derivation of the A/P/F-d verb meaning 'to convince (of/that)'. And, as we saw earlier, the verb "to convince" is actually derived from the probability modality. Thus, this sense of the English word meaning 'to establish/prove' is represented by the word "pintekogasi" in the sample language.
Incidentally, by now it shouldn't be too surprising that obscure grammatical voice operations such as anti-passive, inverse, obviative, etc. can produce so many useful words. Languages that do not have these voice operations must instead use unique root morphemes, periphrasis, metaphors, or even idioms. Because of this, it is important to constantly keep these 'obscure' derivations in mind, especially when you run into difficulties. There are many hidden surprises in such a powerful derivational system as the one proposed here. As an illustration of the fifth principle, consider the word meaning 'to search for' or 'to look for'. If we derive this verb from the complex, energetic state meaning 'searching', then almost none of the other derivations will be very useful. In a situation like this, it's often helpful to create a short dialogue that uses the word in a realistic context. The dialogue may contain other words which can be derived from the required root which, in turn, could provide clues about the nature of the state we're looking for. Here's a sample dialogue: John LOST his watch. It's been MISSING for three days. He said he'll keep SEARCHING until he FINDS it. If we can define a state concept for the P/F-d verb "find", we can then derive the AP/F-p action counterpart meaning 'agent attempts to find focus' = 'to search for'. But what is the state concept for the verb "find"? It appears to be the state meaning 'knowledge of a location'; i.e., to 'find' something means to determine its location. Earlier, in the section on counts and measures, we created the CCM "-vie-", which converted a state to a concept meaning 'to measure or determine the state'. This, of course, is exactly what we need here. The verb meaning 'to look for' can be paraphrased as 'to attempt to determine the location of'. And the state root meaning 'location' is "-me-", the same root we used to derive several locative case tags and verbs. 
Here are some useful derivations using this root and the CCM "-vie-" (default class = AP/F-d): AP/F-d: "meviesi" 'to (seek and) find', 'to locate', 'to determine the location of' AP/F-p: "meviemisi" 'to look for', 'to search for', 'to seek' P/F-d: "meviedosi" 'to happen upon', 'to discover', 'to stumble across', 'to accidentally find' P/F-s: "meviemasi" 'to know the location of' F-p [+AP] adj: "mevieminuno" 'sought after' P/F-d: "menaviedosi" 'to lose', 'to lose track of' Thus, it's important to be especially careful with words which seem to imply energetic states, but in which the agent tries to obtain (or successfully obtains) a clearly defined goal or end point. In fact, without such a goal, the act would be useless. True energetic states, such as "to jog", "to sing", "to play", "to swim", "to twinkle", and so on do not incorporate a pre-defined goal into their meaning. Instead, the activity itself is useful, desirable, or natural. 27.2 DESIGNING BASIC NOUNS In natural languages, basic nouns far outnumber basic verbs. I suspect that this is so because Mother Nature and human ingenuity have provided us with many unique 'things', and humans have created many unique names for them. However, we seem to describe the WAYS these 'things' interact using a much smaller vocabulary. Fortunately, the system proposed here is eminently qualified to deal with this difference in relative numbers, because it inherently allows us to create more basic nouns from a particular root than basic verbs. This is true for two reasons: First, the design of basic nouns is more flexible because the roots are used for their mnemonic value (which can be vague or even metaphoric) rather than for their semantically precise meanings. Second, as AL designers, we are free to create as many noun classes as we feel are necessary, which will allow us to derive even more basic nouns from a particular root. 
What this implies to me is that our top priority should always be to derive most (if not all) basic verbs first. Derivation of nouns should be postponed until later. However, since some nouns may be needed for testing, they should be derived tentatively, and only for the most obvious and harmonious combinations of root-plus-classifier. Once you have compiled a comprehensive list of state/action/modal concepts, you can then start matching them up with appropriate real-world entities. 27.3 SINGLE WORDS OR COMPOUNDS? When designing your vocabulary, you will often have to ask yourself whether a concept should be implemented as a single word or as a compound. Natural languages differ considerably in this respect. For example, English has unique unrelated words meaning 'mouse' and 'rat', while Japanese does not. On the other hand, Swahili has unique, unrelated words for 'soldier ant', 'white ant', and 'brown ant', whereas English forms compounds. Obviously, the word designer will be heavily influenced by his native language, and may unintentionally copy it. In order to avoid this inherent kind of bias, I suggest the following guidelines for living noun classes: For the living noun classes, a single word should be created for each biological category (phylum, order, class, family, or genus) that is linguistically useful; i.e., which is likely to have a single-word representation in a natural language. A single word may also be used to represent a super-category consisting of more than one category, if the categories are similar enough, and if a natural language is unlikely to differentiate between them. For sub-categories (such as individual species) within a category or super-category, a descriptive mnemonic compound should be created. For extremely common sub-categories, a unique common noun can be created. 
To illustrate the first guideline, consider the following chart:

    Common name           Family    Genus & species
    -------------------------------------------------------
    Arctic fox            Canidae   Alopex lagopus
    Bat-eared fox         Canidae   Otocyon megalotis
    Bushdog               Canidae   Speothos venaticus
    Cape hunting dog      Canidae   Lycaon pictus
    Coyote                Canidae   Canis latrans
    Crab-eating fox       Canidae   Cerdocyon thous
    Dingo                 Canidae   Canis familiaris dingo
    Dog                   Canidae   Canis familiaris
    Grey or Timber wolf   Canidae   Canis lupus
    Raccoon dog           Canidae   Nyctereutes procyonoides
    Red fox               Canidae   Vulpes vulpes

As you can see, there is very little consistency in the English names. Using the above guidelines, we would allocate a single word for all members of family Canidae. In the sample language, we will use the action root "-bawa-", meaning 'to bark'. Thus, the basic noun "bawamoda" would refer to any canine, such as 'dog', 'fox', or 'wolf', and the adjective "bawamono" would be equivalent to the English adjective meaning 'canine'.

Now, if the proper noun for 'Arctic' is "Nartikidaya", we could create the mnemonic compound "Nartikinoya bawamodawe" for 'Arctic fox'. If the root meaning 'grey' is "-muzge-" (default class = P-s), then the compound "muzgeno bawamodawe" would mean 'Grey wolf'. (Note that this is the same approach we used earlier to derive the mnemonic compound meaning 'Black bear'.)

For Canis familiaris, we need to allocate a unique common noun. In the sample language, we will use the state root "-bue-" to represent the relationship meaning 'friend' (default class = P/F-s). Thus, the full name for 'dog' is "bueno bawamodawe", and the simple common noun is "buemoda".

We could also create a macro that includes more than one species. For example, we could create a single macro to represent all species that we think are 'wolf-like', and which would correspond in meaning to the English word "wolf".
The problem, though, is that this is inherently arbitrary and you will NEVER find agreement among all natural languages on how such divisions should be made. And if you do it for English words such as "wolf" and "fox", then, in all fairness, you must accept the impossible task of doing it for all other natural languages as well. For the non-living noun classes, I suggest the following approach: 1. If a combination of verb root plus noun classifier is highly suggestive or mnemonic, then use it. 2. Otherwise, if a concept can be implemented by exactly two simpler words, then use the two-word compound, even if the result is slightly too general. For extremely common concepts, a unique common noun can be created. 3. Otherwise, a single word should be created to represent the concept. Using the above approach, words such as "breadbox", "bookshelf", and "desk" (i.e. 'writing table') will be implemented as compounds, while words such as "window", "computer", and "island" will be implemented as single words. Note that approaches (2) and (3) can also be applied to verbs. By allowing compounds that are slightly more general in meaning than their English counterparts (e.g. 'writing table'), the results are more likely to encompass the meanings of equivalent words in other natural languages. Finally, keep in mind that, in normal usage, a long name such as "Nartikinoya bawamodawe", meaning 'Arctic fox', will be used only once in the text or discourse to introduce the name. From that point on, the compound anaphor "banaha" would be used. 28.0 WORD DESIGN PROCEDURE Throughout this monograph, we have been using paraphrases to define the meanings of particular derivations. These paraphrases are very much like dictionary definitions, although more primitive. Also, there is nothing arbitrary about the choice of words used in each paraphrase - a paraphrase can always be unambiguously generated from the meanings of the component morphemes. 
This ability to generate paraphrases unambiguously means that the paraphrases can be automatically generated by a computer, which can greatly speed up the word design process. Of course, the word designer will still have to choose the root concepts using the guidelines that we discussed in the previous section. Once a root concept has been chosen, though, a properly programmed computer can then automatically generate paraphrases of the derivations that use the root concept. For example, if we define the following root concept: State paraphrase: 'liquid' Noun paraphrase: 'water' then a computer can automatically generate the following questions: A/P-s: agent maintains patient in a 'liquid' state = ? A/P-d: agent causes patient to become 'liquid' = ? P-d: patient becomes 'liquid' = ? ... Mammal: 'water' mammal = ? ... Energy: 'water' energy = ? And so on. Appropriate paraphrases can be programmed for all classes. If a paraphrase has an equivalent word or fixed expression in the natural language of the AL designer, then the designer can provide the word or expression to the computer, which will automatically create the dictionary. Paraphrases of derivations using modifying morphemes can use the native word instead of continually repeating the more verbose paraphrase. For example, the following would be the paraphrase for the English word "learn": P/F-d: to become more 'knowledgeable' about focus = ? Once the AL designer has provided the word "learn", further derivations can use this word instead of the more verbose paraphrase. Thus, the verb meaning 'to master' would have the following paraphrase: maximum augmentative: to 'learn' to the greatest degree possible = ? instead of the more verbose: maximum augmentative: to become more 'knowledgeable' about focus to the greatest degree possible In general, specific root morphemes should NOT be chosen at this stage of the design. 
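The question-generating step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `RootConcept`, `VERB_TEMPLATES`, `NOUN_CLASSES`, and `generate_questions` are hypothetical, and the templates cover only the handful of classes quoted in the 'liquid'/'water' example, not the full set of classes in the sample language.

```python
from dataclasses import dataclass

# Hypothetical paraphrase templates, one per verb class. The wording is
# copied from the 'liquid' example; a real tool would need one template
# for every verb class in the language.
VERB_TEMPLATES = {
    "A/P-s": "agent maintains patient in a '{state}' state",
    "A/P-d": "agent causes patient to become '{state}'",
    "P-d": "patient becomes '{state}'",
}

# Hypothetical subset of the noun classes.
NOUN_CLASSES = ["mammal", "energy"]


@dataclass
class RootConcept:
    state: str  # state paraphrase, e.g. 'liquid'
    noun: str   # noun paraphrase, e.g. 'water'


def generate_questions(root):
    """Return one 'paraphrase = ?' question per class for the designer."""
    questions = [
        f"{cls}: {template.format(state=root.state)} = ?"
        for cls, template in VERB_TEMPLATES.items()
    ]
    questions += [
        f"{cls.capitalize()}: '{root.noun}' {cls} = ?"
        for cls in NOUN_CLASSES
    ]
    return questions


if __name__ == "__main__":
    for question in generate_questions(RootConcept(state="liquid", noun="water")):
        print(question)
```

The designer's answers (for example, "learn" for a given paraphrase) could then be stored alongside each question and substituted into later templates in place of the longer paraphrase.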
In other words, the AL designer should not create any actual words, but should postpone the selection of actual root morphemes until later. After a large number of words have been defined, statistical tests of the results can be performed to determine the distribution of number-of-roots versus number-of- productive-derivations. This will allow the designer, ultimately, to assign the shortest roots to the most productive root concepts. 28.1 SAMPLE DERIVATION Just for fun, here's a fairly large (but incomplete) set of derivations using the root "ja-", the speech morpheme we discussed earlier. As a root, it represents the speech act meaning 'say/tell/speak' (default class = A/P/F-p). Here goes... Basic Verbs: jasi to tell (eg. I told John a joke.) jagasi to say, utter, express (eg. He said one word to me. "-ga-" = anti-passive CCM) jaxisi to express, give expression to ("-xi-" = anti-middle) jakuasi to speak (eg. He spoke to her about the meeting. "-kua-" = double anti-passive) jamiusi to have a talk with ("-miu-" = anti-anti-passive) jabosi to discuss, to confer about ("-bo-" = reciprocal) jatasi to ask Basic Nouns: jamoda man, human ("-mo-" = mammal) jasuda parrot ("-su-" = bird) japustada dragon ("-pusta-" = reptile) japoda ent, Tolkien's talking tree people ("-po-" = tree) jafiuda speech synthesizer, vocoder(?) 
                     ("-fiu-" = non-living, artificial, matter & energy)
    javauda          vocal cords ("-vau-" = living matter)
    jateda           stage ("-te-" = artificial location)
    jamoida          theater ("-moi-" = building/residence)
    jaboteda         forum
    jakida           electric speaker ("-ki-" = artificial item)
    japaida          speech sound/energy ("-pai-" = non-living energy)
    jacada           megaphone ("-ca-" = tool/implement)
    jameskoda        microphone ("-mesko-" = measurement/detection device)
    jabiuda          language/speech community ("-biu-" = abstract group)
    jaxoda           language ("-xo-" = performance)
    jaloxoda         dialect ("-lo-" = similar/like)
    jatiwada         linguistics ("-tiwa-" = field or profession)
    janeyada         linguist ("-neya-" = member of a profession)
    jabolida         dialogue, conversation, discussion ("-li-" = performance component/result)
    jalida           utterance, speech act
    jajulida         phoneme ("-ju-" = 'minimal' MCM)
    jasolida         morpheme ("-so-" = 'not too' MCM)
    jagelida         word ("-ge-" = 'very' MCM)
    japilida         sentence ("-pi-" = 'maximal' MCM)
    japisenjelida    paragraph ("-senje-" = group CCM)

    [Note that "-senje-" occurs BEFORE the noun classifier, and is thus being used only for its mnemonic value. The alternative "japilisenjeda" means literally 'group of sentences', which is not necessarily a paragraph.]

    Jabodaya         the name of the sample language ("-daya" = proper noun terminator)

Miscellaneous:

    jasifnesi        to reply, to respond ("-sifne-" = 'back/in return')
    jakuada          statement ("-kua-" = double anti-passive)
    jasifnekuada     reply, response
    jatakuada        question, query
    japino           talkative, garrulous
    japiloida        chatterbox, big mouth ("-loi-" = rude/insulting MCM)
    jabocausi        to shoot the breeze about ("-cau-" = informal MCM)
    jaboloisi        to shoot off one's face about
    jageliniogasi    to verbalize, to put into words, to express in words
    jabolicauda      bull session
    jaguisi          to blurt out (to), to tell/say accidentally ("-gui-" = P/F-p verb classifier)

And I'm sure there are many others.
Keep in mind that whatever appears before a noun classifier does not have to be semantically precise - it can be used for its mnemonic value. Also, note that since "ja-" is a speech act, non-agentive verb derivations are not very useful. 29.0 USING WORDS: LITERALNESS, POLYSEMY, METAPHOR, AND IDIOM Throughout this monograph, we've seen many examples of derivations whose English counterparts were periphrastic, polysemic, metaphoric, or even idiomatic. In fact, when speakers of natural languages use non-literal language it is almost always because they are forced to do so. They cannot avoid it either because their vocabulary does not have an appropriate literal construction available, or because it is something that the speaker is not comfortable using. This is unfortunate because the way that a non-literal construction will be interpreted will depend very much on the native language and culture of the listener. For example, metaphoric use of the word "pig" can have meanings such as "slob", "sex maniac", or "over-eater" in English, but will have different meanings to speakers of other languages. Also, as we've seen many times throughout this monograph, many metaphors, including the above examples, can be avoided by using appropriate derivations instead. For example, pejorative morphemes can be used to implement the above examples. In fact, I have become completely convinced that a properly derived word can replace ANY required or unavoidable metaphor, and it can never be misinterpreted by native speakers of other languages. Thus, the goal of an AL designer should be to provide the means to say ANYTHING without the need for non-literal language. In other words, metaphor, polysemy, and idiom should be optional - they should NEVER be obligatory. It is also my opinion that non-literal language should be generally avoided (except where its use is obvious to all listeners or readers), since the possibility for misunderstanding is so great. 
[Since this monograph is about the use of semantic precision in word design, it is not the proper place to discuss non-literal language, and I will say no more about it here. However, if you would like to read more about the dangers of metaphor, see my separate essay entitled "Metaphor".] 30.0 FINAL WORD ON FOCUS At the very beginning of this monograph, I stated that the focus case role is vague and even somewhat "out-of-focus". Furthermore, even our working definition of focus is vague: that the focus is the referent of an actual or potential relationship with the patient. Actually, I don't really think that the above definition is needed, even though I DO believe that it is accurate. In fact, we can come up with a different and perhaps better definition if we look at our primary case roles as sets of binary features, an approach which is often quite useful in linguistics. There are only two features, agent and patient, and there are only four possible combinations: A case role -> +agent, -patient P case role -> -agent, +patient AP case role -> +agent, +patient F case role -> -agent, -patient In other words, the focus case role is the primary case role that is neither agent nor patient. Thus, focus is indeed vague, but it is definitely not ambiguous. Also, even though we have derived many verbs that do not have a focus as part of their argument structure, the simple fact is that ALL verbs are focused to some degree. When a focus is not explicit, it is either incorporated into the meaning of the verb or is too vague, general, or variable to require an explicit form. Because of this, we are able to de-focus words that, on first examination, seem to be inherently focused, such as locatives (e.g. "to stay put" vs. "to stay at"), temporals (e.g. adverb "earlier" vs. case tag "before"), and so on. In each case there is a "default" focus that we all seem to understand intuitively. Sometimes the default focus is the time of the utterance (e.g. 
the temporal adverbs) or the initial location of the patient (e.g. the locative adverbs), simply because no other interpretation makes sense. At other times the meaning seems to be totally idiosyncratic, and can include any or all possible foci. Unfortunately, aside from some vague and elusive ideas, I do not know how to describe the semantics of these default foci, and I'm not sure it's even possible. At this point in time, I can only provide the following guidelines: If it is truly possible for an unfocused verb to have more than one referent in the provided context, then it DOES have any or all of them, and any or all of these possible referents IS the default focus (e.g. "The country is rich" vs. "The country is rich in copper"). If it is NOT possible for an unfocused verb to have more than one referent in the provided context, then the default focus is the ONE possible referent that makes sense (e.g. "The old man left the house" vs. "The old man went away"). The above is not what I would call 'semantically precise', but it's the best I can do for now. 31.0 DEFAULT VERB CLASSES Several times throughout this monograph, I indicated that certain roots had default verb classes without explaining WHY I chose those particular defaults. In this section, I would like to summarize the basic verb types. In doing so, the reasons for the defaults will become clear. I've already stated that there are two basic verb types: state verbs and action verbs. State verbs are derived from patient-oriented concepts, while action verbs are derived from agent-oriented concepts. However, these two basic types are not sufficient to decide which defaults should be used. We need to sub- divide the basic verb types into more detailed categories. Here is a list of those categories and the default verb classes that I have chosen to use in the sample language: Action verbs: Physical acts: default = A/P-d e.g. to kick, to ram, to slap Speech acts: default = A/P/F-p e.g. 
to say, to curse, to congratulate Activities: default = AP-s e.g. to sing, to smoke, to swim State verbs: Mental states: default = P/F-s e.g. loving, knowing, fearing, wanting Relational states: default = P/F-s e.g. inside, after, possessing, meaning Scalar states: default = P-s e.g. blue, heavy, intelligent, smelly Numeric states: default = P-s e.g. five, seventh, many Deictic states: default = P-s e.g. this, now, you, here Binary states: default = A/P-d e.g. alive, closed, asleep, broken I chose the above defaults to reflect the argument structures that are most commonly used with the verbal subtypes. For example, activities such as "singing", "playing", and "dancing" are most commonly used as AP-s verbs. When focused, they elaborate the event; i.e. "sing a song", "dance a polka", etc. The focus of an action verb always elaborates the event; i.e., it provides more detail about what the agent is doing. The focus of a mental state elaborates what the mind is doing; i.e., it indicates what the mind is focused on. The focus of a relational state indicates the referent of the state. The focus of a scalar state indicates the actual position of the state on a scale of possibilities; i.e., it elaborates the magnitude of the state (or the change in magnitude for dynamic verbs). The focus of deictic and binary states is almost always incorporated into the verb, but it elaborates the state on the rare occasion when it is actually used. The focus of a numeric state is the larger set from which the more specific quantity is being selected. Note that all modal concepts are mental states, and temporal and locative concepts are relational states. A useful generalization is that the focus of ALL verbs provides greater context for the situation or event. 
More specifically, the focus of an action verb elaborates the event (i.e., it tells us more about what the agent does), while the focus of a state verb is the referent of the state (i.e., it tells us more about the state of the patient relative to the focus).

With the above in mind, we can now present a chart that indicates the default verb classes for all root concepts:

    Physical acts:                                           A/P-d
    Speech acts:                                             A/P/F-p
    Activities:                                              AP-s
    Register acts:                                           A/P-p
    Mental states (including modals):                        P/F-s
    Relational states (including temporals and locatives):   P/F-s
    Scalar states:                                           P-s
    Numeric states:                                          P-s
    Deictic states:                                          P-s
    Binary states:                                           A/P-d
    Basic nouns (including abstract nouns):                  P-s
    Generic "-ze-":                                          A/P-s
    All other morphemes (including scalar polarity
      morphemes):                                            "0"

32.0 SUMMARY: A COMPREHENSIVE LEXICO-SEMANTIC SYSTEM

I hope that by now I have convinced the reader of the value of a powerful derivational system. I cannot emphasize too much that a system like the one that I'm proposing here will maximize the neutrality of the vocabulary of an AL, while almost completely eliminating the need for ad hoc and arbitrary word creation. It will also reduce to an absolute minimum the number of morphemes that a student of the language will have to memorize.

One of the greatest difficulties in learning a new language is mastering the idiosyncrasies of the vocabulary. This is so because a word in one language rarely means exactly the same thing as its closest counterpart in a different language. In other words, the "semantic space" of each word in a natural language is arbitrary - the result of centuries of evolution and accident. In effect, each word of a natural language has built-in irregularities that the student must learn.

Unfortunately, most AL designers unwittingly clone their native vocabulary, not realizing the difficulty that will be faced by potential students of the AL.
The net result is that the meaning of a word cannot be deduced from more basic and universal concepts that have the same meaning for everyone, but instead depends almost exclusively on its meaning in only one natural language - the native language of the AL designer. In such an AL, the semantic space of each word is arbitrary, and mastering the idiosyncrasies of the entire vocabulary can take years of effort. Thus, different speakers WILL use the words differently, and misunderstandings WILL occur because there are no rules that can be followed to determine the precise semantic space of a word. Instead, each speaker will use the word in the same way he would use the closest equivalent in his native language.

In the system proposed here, the semantic space of each word is precisely defined in terms of the much more basic meanings of the morphemes that make up each word. And while there may be some arbitrariness in the selection of the root concepts, the overall arbitrariness of the entire vocabulary will be much, much less. Thus, even though we may never be able to achieve true neutrality, we can certainly come very close.

APPENDIX A: THE PHONOLOGY AND MORPHOLOGY OF THE SAMPLE LANGUAGE

    Word           ::= { Morpheme } + Part-of-speech
    Morpheme       ::= C D | C V { X }
    Part-of-speech ::= Terminator
    Terminator     ::= bie | cu | da | dawe | daya | di | fo | giu |
                       ha | he | hi | ho | je | jewe | jeya | ka |
                       nia | no | nowe | noya | pe | pewe | peya |
                       si | sia | siwe | tiu | vu | vue | vuya

    C  = any consonant (p, b, t, d, k, g, c, j, l, m, n, f, v, s, z, x)
    D  = any diphthong (ai, au, eu, ia, ie, io, iu, oi, ua, ue, ui, uo)
    V  = any vowel (a, e, i, o, u)
    S  = any semi-vowel (w, y)
    X  = extension = S V | C C V
    |  = logical 'or'
    {} = enclosed item may appear zero or more times

    Lower case letters represent themselves

Pronounce vowels as in Italian, Swahili, or Japanese (i.e. the five cardinal vowels /a/, /e/, /i/, /o/, and /u/).
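The grammar above is regular enough to be checked mechanically. The following Python sketch (not part of the original design document) splits a single word into its morphemes and terminator. The names `segment`, `MORPHEME`, and `TERMINATORS` are hypothetical. One assumption is labeled in the code: the consonant set is the one listed for C plus "h" and "q", which the text uses (for example in the terminators "ha", "he", "hi", "ho") although they are missing from that list.

```python
import re

# Consonants as listed for C, plus "h" and "q" (ASSUMPTION: the text uses
# both, e.g. the terminator "ha", but the C list omits them).
CONS = "pbtdkgcjlmnfvszxhq"
SEMI = "wy"
DIPH = "ai|au|eu|ia|ie|io|iu|oi|ua|ue|ui|uo"

# Morpheme ::= C D | C V { X }, where X = S V | C C V
MORPHEME = re.compile(
    rf"[{CONS}](?:(?:{DIPH})|[aeiou](?:[{SEMI}][aeiou]|[{CONS}]{{2}}[aeiou])*)"
)

TERMINATORS = set(
    "bie cu da dawe daya di fo giu ha he hi ho je jewe jeya ka nia no "
    "nowe noya pe pewe peya si sia siwe tiu vu vue vuya".split()
)


def segment(word):
    """Split one word into (list of morphemes, terminator)."""
    word = word.lower()
    units, pos = [], 0
    while pos < len(word):
        match = MORPHEME.match(word, pos)
        if match is None:
            raise ValueError(f"cannot parse {word!r} at position {pos}")
        units.append(match.group())
        pos = match.end()
    if not units or units[-1] not in TERMINATORS:
        raise ValueError(f"{word!r} does not end in a terminator")
    # A terminator-shaped unit may only appear at the end of a word.
    if any(unit in TERMINATORS for unit in units[:-1]):
        raise ValueError(f"{word!r} contains a terminator before its end")
    return units[:-1], units[-1]


if __name__ == "__main__":
    for sample in ["bawamodawe", "pintekogasi", "jageliniogasi", "Vu"]:
        print(sample, segment(sample))
```

The sketch relies on one property of the grammar: a single consonant followed by a vowel always begins a new morpheme, while a semi-vowel or a two-consonant cluster always continues the current one, so a left-to-right scan never needs to backtrack.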
Pronounce consonants as in English, except for the following: "c" is like "ch" in "church", "x" is like "sh" in "shop", and "q" is like "s" in "measure". [The consonant "q" is not very common among the world's languages, but I included it to maintain the balance between voiced and unvoiced consonants. I would NOT use it unless it becomes absolutely necessary.] The consonant "n" may be pronounced as a velar nasal (like "ng" in "sing") before "g" and "k". Pronounce 'y' as in 'royal' and 'w' as in 'awake'.

A word with the above morphology can always be parsed unambiguously into its component morphemes, and a stream of words can always be divided unambiguously into individual words even if there are no spaces between words. Thus, the boundaries between morphemes and words are never in doubt. This feature of word morphology is usually referred to as either _self-segregation_ or _auto-isolation_.

Because of the self-segregation rules, a morpheme cannot have the form of a terminator. For example, a root with the form "-da-" is illegal because "da" is a terminator. However, "-daye-", "-daspo-", and "-sunda-" ARE legal because the extensions "-ye-", "-spo-", and "-nda-" create unique morphemes. [Note that I do not consider a terminator to be a morpheme, even though it has the form of a morpheme. This may not be technically correct, but it is useful for our purposes.]

There are limitations on which vowels may be juxtaposed in a vowel cluster and on which consonants may be juxtaposed in a consonant cluster. As a general rule, NO geminate phonemes are allowed. Thus, "dd", "aa", "nn", and so on can never occur in the language.

Only the following vowel clusters are allowed:

    ai au eu ia ie io iu oi ua ue ui uo

In the above clusters, the vowels "i" and "u" may be optionally pronounced like the semivowels "y" and "w", respectively (pronounce "ui" as /wi/, and "iu" as /yu/).

Additional vowel nuclei can be formed by placing a semi-vowel between two vowels.
For example, we cannot have "ea" but we can have "eya", and we cannot have "oa" but we can have "owa". However, "y" may never be followed by "i" and "w" may never be followed by "u". Thus, combinations such as "oyi" and "awu" are forbidden. If "y" is inserted after "e" or "i", or "w" is inserted after "o" or "u" in one of the allowable vowel clusters, then the result will be an allophone with exactly the same meaning. Thus, "iyu" is identical to "iu", "uwa" is identical to "ua", "eyu" is identical to "eu", and so on. In effect, a diphthong is simply a VSV in which the semi-vowel has been deleted; the semi-vowel 'w' is deleted if either V is 'u', and the semi-vowel 'y' is deleted if either V is 'i'. Thus, for example, "-io-" is actually "-iyo-", "-eu-" is actually "-ewu-", "-ui-" is actually "-uwi-", "-iu-" is actually "-iyu-", and so on. For consonant clusters, ALL combinations of exactly two consonants are allowed except as follows: 1. Only one of the consonants in a cluster may be a stop or an affricate. For example, "km", "nt", and "zg" are allowed while "kc", "pt", and "db" are not allowed. 2. Only one of the consonants in a cluster may be a fricative. For this test, affricates are considered to be fricatives. For example, "sk" and "dz" are allowed while "sc", "cx", and "zj" are not allowed. 3. Both consonants must have the same voicing unless one of them is "l", "m", or "n". Thus, "nc", "st", "px", and "gv" are allowed, but "sd", "bx", and "kv" are not allowed. 4. The clusters "tx" and "dq" are not allowed because they can be easily confused with the affricates "c" and "j". 5. The consonant "h" may never be the first consonant in a cluster, and may only appear after "l" or "n". Stress is not necessary for proper understanding. However, for the sake of consistency, it is recommended that stress be applied according to the following rules: 1. A syllable is defined such that any occurrence of CV or SV is the beginning of a new syllable. 
Thus, a syllable can have one of five forms: CV, CVV, CVC, SV, and SVC. Examples: da, fe/wa/da, ki/di, Po/da/ya, do/sau/pe, gam/bu/ma/yas/ti/no, can/do/ka, etc. Note that a syllable boundary may appear within a morpheme, but that the start of a morpheme is always the start of a syllable. 2. All words should be stressed on the next-to-last syllable (i.e. penultimate stress). 3. If a word contains more than four syllables, then also stress the second syllable. 4. Stress can be applied as greater volume, higher pitch, longer duration, or any combination thereof. Default classes of roots depend on their meaning as follows: Physical acts: A/P-d Speech acts: A/P/F-p Activities: AP-s Register acts: A/P-p Mental states (including modals): P/F-s Relational states (including temporals and locatives): P/F-s Scalar states: P-s Numeric states: P-s Deictic states: P-s Binary states: A/P-d Basic nouns (including abstract nouns): P-s Generic "-ze-": A/P-s All other morphemes (including scalar polarity morphemes): "0" Once a class is assigned, either explicitly or by default, only a classifier or class-changing morpheme can change it. For example, if a word consists of exactly two root morphemes and a terminator, then the first root will provide the class. Previous-word modifiers have the property of modifying an entire syntactic constituent when they are applied to the head word of a constituent. For example, when a previous-word modifier follows a noun, it will modify the entire noun phrase. APPENDIX B: MORPHEMES OF THE SAMPLE LANGUAGE This appendix contains a complete list of all of the morphemes that were created in this monograph, including terminators. 
Terminators: Verb: -si- Proper verb: -sia- Mnemonic verb: -siwe- Imperative: -cu- Adverb/case tag: -pe- Proper adverb: -peya- Mnemonic adv: -pewe- Noun: -da- Open noun: -giu- Proper noun: -daya- Mnemonic noun: -dawe- Adjective: -no- Open adjective: -bie- Proper adj: -noya- Mnemonic adj: -nowe- Previous-word modifier: -di- Open PWM: -nia- Vocative: -vu- Proper voc: -vuya- Mnemonic voc: -vue- Modal/tense/aspect/ disjunct: -fo- Particles: -ka- Anaphora: -ha- -he- -hi- -ho- Reserved: -je- -jewe- -jeya- -tiu- Verb Classifiers: A/P/F-s: -tue- A/P/F-d: -ko- A/P/F-p: -nio- A/P-s: -zoya- A/P-d: -pu- A/P-p: -ce- AP/F-s: -fi- AP/F-d: -sua- AP/F-p: -mi- AP-s: -panji- AP-d: -za- AP-p: -diu- P/F-s: -ma- P/F-d: -do- P/F-p: -gui- P-s: -se- P-d: -pia- P-p: -moncu- 0/A: -fia- 0/AP: -piu- 0/P: -gu- 0/F: -jo- 0: -la- Noun Classifiers: matter & energy: living, animals -nembi- vertebrates, mammals -mo- birds -su- reptiles -pusta- fish -sai- insects (all arthropods) -zio- living, plants -kaya- trees & shrubs -po- other spermatophytes -tonze- non-living, natural -ji- non-living, artificial -fiu- matter: living -vau- non-living, natural, substance -fa- locative -nai- other -le- non-living, artificial, substance -niu- locative -te- other -ki- energy: living -dengi- non-living -pai- time: -be- Abstract Nouns: -ta- measurements -biu- groups/organizations -xo- performances -li- performance components and results -tiwa- fields of endeavor -vo- field components (i.e. schools) and results (i.e. 
            styles)
   -neya-   member of a profession (= -tiwa-panji-)

Compounding Classifiers:

   -kai-     room
   -moi-     building/residence
   -xempi-   shop/business
   -mesko-   measuring device
   -ca-      tool/implement
   -gau-     vehicle

Class-Changing Morphemes:

   -na-      not, other than
   -de-      middle
   -xi-      anti-middle
   -gue-     anti-anti-middle
   -voi-     double middle
   -ceu-     double anti-middle
   -nu-      passive
   -ga-      anti-passive
   -miu-     anti-anti-passive
   -jau-     double passive
   -kua-     double anti-passive
   -vi-      inverse ( A/P/F-x -> P/A/F-x )
   -viga-    obviative ( P/F-x -> F-x [+P] allowing case tag to bind
             to focal subject )
   -ne-      cosubject ( demotes part of the subject and makes it
             obliquely expressible )
   -sau-     non-subject ( an entity is specifically excluded from
             being subject )
   -gi-      convert to count noun
   -jazmi-   convert to mass noun
   -senje-   convert to group noun
   -ve-      essential quality or ability
   -pa-      process
   -mante-   process result or product
   -xa-      genitive (result class = P-s)
   -tu-      reflexive
   -bo-      reciprocal ( A/P/F -> A=P/F )
   -pasku-   P=F reciprocal ( A/P/F -> A/P=F )
   -fu-      infinitive, same subject as outer verb
   -vua-     make all arguments of a verb oblique
   -vie-     determine/measure state (result class = AP/F-d)

Scalar Polarity Roots/MCMs:

   -lau-     too, excessively
   -pi-      maximally, extremely
   -ge-      very, highly
   -xe-      ??? midpoint, average, so-so ???
   -so-      not too, not very
   -ju-      minimally, barely, hardly
   -zunda-   almost, not quite
   -pinte-   definitely, absolutely (= 100% epistemic probability)
   -junte-   not at all, not ...
             whatsoever (= 0% epistemic probability)
   -pusli-   just, only
   -ku-      interrogative

Register Roots/MCMs:

   -xemna-   fawning, groveling, subservient
   -tenko-   humble
   -mio-     polite, very formal
   -zai-     formal
   -cau-     informal, slang
   -loi-     contemptuous, rude, insulting
   -pie-     vulgar, filthy, tasteless
   -xua-     macho
   -xesmi-   effeminate

Numeric roots/MCMs:

   A basic number will have the following format:

      ( radix ) + ( minus sign ) + [ digit ] + ( decimal point + [digit] )
         + ( exponent + (minus sign) + [digit] ) + ( ordinality )
         + part-of-speech

   Here are the number-forming morphemes:

   -heksi-   hexadecimal radix (default = base ten)
   -minsu-   minus sign (default = positive)
   -zeyo-    zero
   -fe-      one
   -du-      two
   -zi-      three
   -kau-     four
   -poi-     five
   -bua-     six
   -vastu-   seven
   -ketsa-   eight
   -go-      nine
   -dai-     ten
   -senti-   hundred
   -kio-     thousand
   -milni-   million
   -maya-    A hex = 10 decimal
   -biwi-    B hex = 11 decimal
   -cawa-    C hex = 12 decimal
   -doyo-    D hex = 13 decimal
   -neye-    E hex = 14 decimal
   -fuyu-    F hex = 15 decimal
   -divde-   divider, X/Y
   -fevde-   divider, 1/X
   -cuye-    decimal point
   -jinta-   exponent
   -hu-      negative exponent
   -vevna-   real/imaginary separator
   -kawa-    N at a time, N per group, in groups of N
   -saksi-   all, the whole amount
   -mai-     many, much, a lot, a large amount
   -xandu-   not too many, not too much
   -pewa-    few, little, a small amount
   -zonja-   any, positive non-zero, one or more, greater than zero

   Ordinality:

   --        cardinal (this is the default)
   -xunga-   ordinal

Deictics:

   1:       mi-      Pers:  -st-
   2:       du-                      Sing:    -a-
   3:       se-      Dem:   -mp-
   1+2:     ci-                      Plur:    -i-
   1+3:     be-      Loc:   -ng-
   2+3:     fa-                      Unspec:  -u-
   1+2+3:   po-      Tem:   -lk-

Tense/aspect:

   Tense                    Aspect
   -----                    ------
   Past:         -lu-       Perfect:       --  (default)
   Present:      -co-       Imperfect:     -nsa-
   Future:       -ti-       Iterative:     -mpo-
   Unspecified:  -ba-       Habitual:      -ntu-
                            Inceptive:     -spi-
                            Continuative:  -mbe-
                            Terminative:   -nzi-
                            Completive:    -ksu-
                            Reserved:      -ple-
                            Reserved:      -lto-
                            Unspecified:   -nda-

Modals:

   Degree                   Modality
   -------------------------------------------------------------------
   100%            pi-      Probability (epistemic)      -nte-
   High            ge-      Evidentiality (epistemic)    -sna-
   Low             so-      Adequacy (epistemic)         -ngo-
   Very low        ju-      Significance (epistemic)     -mbe-
   0%              na-      Obligation (deontic)         -ndu-
   Undefined       xe-      Inevitability (deontic)      -sko-
   Interrogative   ku-      Necessity (deontic)          -tsi-
                            Importance (deontic)         -spu-

   Speaker-oriented obligation: -nka-

Other Roots/MCMs:

   -bawa-      bark, howl (action)
   -benzo-     closed/shut/unopened (binary state)
   -bue-       friend (relational state)
   -denga-     dirty (scalar state)
   -gaya-      female
   -gelba-     yellow (scalar state) [also 'banana' (basic noun)]
   -gua-       liquid (binary state)
   -hayu-      heavy (scalar state) [also 'bear' (basic noun)]
   -ja-        speech morpheme
   -jandoya-   congratulate (speech act)
   -kanti-     having quantity/amount (scalar state)
   -kapsu-     same, equal
   -ke-        up/above (locative state)
   -lenga-     spatially long (scalar state)
   -lo-        similar, like, about, approximately
   -me-        located at/in (locative state)
   -muzge-     grey (scalar state)
   -pau-       contingency relationship, P is contingent on F
   -sawa-      temporally long (scalar state)
   -sifne-     reciprocal 'back/in return'
   -tenci-     smart/intelligent (scalar state)
   -teyo-      knowing/knowledgeable (mental state)
   -tomba-     hedge
   -veya-      real/existent (binary state)
   -xau-       hot (scalar state)
   -xava-      black (scalar state)
   -xawe-      clean (scalar state)
   -xenda-     feeling love/affection
   -xoya-      alive/living (binary state)
   -xumpi-     white (scalar state)
   -ze-        generic action verbs (default A/P-s)

Particles:

   Comparatives:

      taka        'than'
      geka        'more'
      pika        'most'
      kapsuka     'as much/many as'
      soka        'less'
      juka        'least'

      getaka      = "geka taka"      'more than'
      pitaka      = "pika taka"      'the most among'
      kapsutaka   = "kapsuka taka"   'as much/many as'
      sotaka      = "soka taka"      'less than'
      jutaka      = "juka taka"      'the least among'

   Resumptive pronoun:              ka
   Genitive:                        xaka
   Heavy topicalization particle:   pika
   Referent-switching particle:     boka
   Coordination initiator:          ceka
   Coordination terminator:         saksika
   Parenthetical start:             suka
   Parenthetical stop:  complete    toka
                        incomplete  mika

   Other:

      neka        'and'
      pauka       'then'
      pauvika     'if'
      dengaka     'crap'
      deka        'shit'
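
As a rough illustration of how the Degree and Modality columns of the
Modals table in this appendix can be combined, here is a minimal Python
sketch. It assumes that a modal morpheme is formed by plain concatenation
of a degree prefix and a modality morpheme, as the listed form "-pinte-"
(pi- + -nte-, 100% epistemic probability) suggests. The function names
compose_modal and gloss_modal are hypothetical helpers invented for this
example; they are not part of the language design itself.

```python
# Degree prefixes and modality morphemes, copied from the Modals table.
DEGREES = {
    "pi": "100%",
    "ge": "High",
    "so": "Low",
    "ju": "Very low",
    "na": "0%",
    "xe": "Undefined",
    "ku": "Interrogative",
}

MODALITIES = {
    "nte": "Probability (epistemic)",
    "sna": "Evidentiality (epistemic)",
    "ngo": "Adequacy (epistemic)",
    "mbe": "Significance (epistemic)",
    "ndu": "Obligation (deontic)",
    "sko": "Inevitability (deontic)",
    "tsi": "Necessity (deontic)",
    "spu": "Importance (deontic)",
}


def compose_modal(degree, modality):
    """Join a degree prefix and a modality morpheme.

    Assumption: the two parts are simply concatenated, with no
    terminator or other morpheme added.
    """
    if degree not in DEGREES:
        raise ValueError(f"unknown degree prefix: {degree!r}")
    if modality not in MODALITIES:
        raise ValueError(f"unknown modality morpheme: {modality!r}")
    return degree + modality


def gloss_modal(morpheme):
    """Split a degree+modality morpheme and return both table glosses."""
    for degree, degree_gloss in DEGREES.items():
        if morpheme.startswith(degree):
            rest = morpheme[len(degree):]
            if rest in MODALITIES:
                return degree_gloss, MODALITIES[rest]
    raise ValueError(f"not a degree+modality combination: {morpheme!r}")


if __name__ == "__main__":
    print(compose_modal("pi", "nte"))   # pinte
    print(gloss_modal("gendu"))         # ('High', 'Obligation (deontic)')
```

The sketch only pairs table entries; it says nothing about where the
resulting morpheme sits in a complete word or which terminator follows it.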
********** The End **********