So, stemming, what is stemming? Generally speaking, stemming is finding the basic form of a word. For example, in the sentence "he walks" the verb is inflected by adding an "s" to it. In this case the stem is "walk" which, in English, also happens to be the infinitive of the verb.

We will first present a few examples of stemming in natural language, and since Dutch is my native language I will concentrate on Dutch examples. After that we will show the results of a number of stemmers present in Solr and give a few pointers about what to do if the results of these stemmers are not good enough for your application.

One of the things you absolutely want your users to be able to do is to find results which contain the singular form of a word while searching for the plural and vice versa, e.g.: finding "cat" when looking for "cats" and finding "cats" when searching for "cat".

Although in English there are well-defined rules for creating the plural form (add the suffix "s" or "es", or change "y" to "ie" and add "s"), there are also a number of irregular nouns ("woman" -> "women") and nouns for which the singular and plural form are the same ("sheep", "fish").

In Dutch more or less the same situation exists, albeit with different suffixes ("s", "'s", "en") and, of course, other exceptions.

Furthermore, in Dutch if the stem ends in a consonant directly preceded by a short vowel, this consonant is doubled (otherwise, in the plural form, the vowel would sound like a long vowel instead of a short vowel), e.g.: kat (cat) -> katten (cats)

But there are also exceptions to this rule, like monnik (monk) -> monniken (monks)

In contrast with: krik (car jack) -> krikken (car jacks)

Conjugation of verbs in Dutch is, to be blunt, a bit of a mess.

In Dutch, two types of conjugation co-exist for forming the past tense of a verb: the (pre-)medieval system, now called strong, and the more recent system, called weak.

When I say the systems co-exist, one should note that most (native) Dutch speakers are not aware of the fact that the strong system is a system at all: they consider the strong verbs to be exceptions, best learned by heart.

An example of a strong verb is "lopen" (to walk): hij loopt (he walks) -> hij liep (he walked)

While an example of a weak verb is "rennen" (to run): hij rent (he runs) -> hij rende (he ran)

These examples make clear that determining which verb is strong and which verb is weak is indeed a case of learning by heart.

Furthermore, the change from strong to weak verbs is an ongoing process. One example of a verb which is currently in transition from strong to weak is the verb "graven" (to dig), of which both the form "hij groef" (he dug) and the form "hij graafde" can be found, although most language purists would consider the latter form "wrong".

NB: if you are interested in this kind of thing, a classic book about language change is Jean Aitchison's Language Change: Progress or Decay? (1981, yes, it is a bit pre-internet.)

Diminutives

In a number of languages, like Dutch, German, Polish and many more, diminutives are created by inflecting the word.

In English you form the diminutive by adding an adjective like 'little', but in Dutch the general rule to form a diminutive is to add the suffix "je" to the word, e.g.: huis (house) -> huisje (little house)

This is only the general rule, because the suffix can also be inflected, like in bloem (flower) -> bloempje (little flower)

And in some words the final consonant is changed to keep the word pronounceable: hemd (shirt) -> hempje (little shirt)

It is however also possible in Dutch to use an adjective like 'klein' (little) and even to combine both: kleine bloem (little flower) -> klein bloempje (small little flower)

A last peculiarity I should mention is that in Dutch (but also in many other languages) there are words which only have a diminutive form, like 'meisje' (girl).
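To make the idea of stemming concrete, here is a minimal Python sketch of a toy stemmer for one Dutch pattern covered in this post: a plural formed with the suffix "en", with or without a doubled final consonant. The function name `naive_dutch_plural_stem` is hypothetical and the rule set is an assumption made purely for illustration; it is not how any of the Solr stemmers work.

```python
VOWELS = "aeiou"


def naive_dutch_plural_stem(word: str) -> str:
    """Toy rule: strip a plural "en" suffix and undo consonant doubling.

    Illustration only. It ignores the "s" and "'s" suffixes, irregular
    plurals, and the restoration of long vowels.
    """
    w = word.lower()
    if len(w) > 4 and w.endswith("en"):
        stem = w[:-2]
        # A doubled final consonant ("katt", "krikk") is collapsed to one.
        if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in VOWELS:
            stem = stem[:-1]
        return stem
    # Anything else is returned unchanged.
    return w


if __name__ == "__main__":
    for plural in ("katten", "krikken", "monniken"):
        print(plural, "->", naive_dutch_plural_stem(plural))
```

With this sketch "katten" and "krikken" lose their doubled consonant, while "monniken" simply loses "en". It also shows why simple rules fall short: for a plural such as "bomen" (trees) it yields "bom" instead of the singular "boom", because the long vowel is not restored.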