Linguistic attributes (lemma, pos and features) for TEI

Author: Eduard Drenth, 2024-06

Schema corpora_linguistics: Elements

<link> Developed by the Fryske Akademy, June 2024. Syntactic relations: https://universaldependencies.org/u/dep/index.html
Module linking
Attributes
lemma
Status Optional
Datatype teidata.enumerated
ana
Status Optional
Datatype teidata.enumerated
Legal values are:
acl
https://universaldependencies.org/u/dep/acl.html Clausal modifier of nominal
acl:relcl
https://universaldependencies.org/u/dep/acl-relcl.html Relative clausal modifier of nominal
advcl
https://universaldependencies.org/u/dep/advcl.html Adverbial clause modifier
advmod
https://universaldependencies.org/u/dep/advmod.html Adverbial modifier
amod
https://universaldependencies.org/u/dep/amod.html Adjectival modifier
appos
https://universaldependencies.org/u/dep/appos.html Appositional modifier
aux
https://universaldependencies.org/u/dep/aux.html Auxiliary
aux:pass
https://universaldependencies.org/u/dep/aux-pass.html Passive auxiliary
case
https://universaldependencies.org/u/dep/case.html Case marking
cc
https://universaldependencies.org/u/dep/cc.html Coordinating conjunction
ccomp
https://universaldependencies.org/u/dep/ccomp.html Clausal complement
cc_preconj
https://universaldependencies.org/u/dep/cc_preconj.html Preconjunct
conj
https://universaldependencies.org/u/dep/conj.html Conjunct
cop
https://universaldependencies.org/u/dep/cop.html Copula
csubj
https://universaldependencies.org/u/dep/csubj.html Clausal subject
dep
https://universaldependencies.org/u/dep/dep.html Unspecified dependency
det
https://universaldependencies.org/u/dep/det.html Determiner
discourse
https://universaldependencies.org/u/dep/discourse.html Discourse element
expl
https://universaldependencies.org/u/dep/expl.html Expletive
expl:pass
https://universaldependencies.org/u/dep/expl-pass.html Reflexive pronoun used in reflexive passive
expl:pv
https://universaldependencies.org/u/dep/expl-pv.html Reflexive clitic with an inherently reflexive verb
fixed
https://universaldependencies.org/u/dep/fixed.html Fixed multiword expression
flat
https://universaldependencies.org/u/dep/flat.html Flat multiword expression
flat_foreign
https://universaldependencies.org/u/dep/flat_foreign.html Flat multiword expression: foreign
flat_name
https://universaldependencies.org/u/dep/flat_name.html Flat name
iobj
https://universaldependencies.org/u/dep/iobj.html Indirect object
mark
https://universaldependencies.org/u/dep/mark.html Marker
nmod
https://universaldependencies.org/u/dep/nmod.html Nominal modifier
nmod:poss
https://universaldependencies.org/u/dep/nmod-poss.html Possessive nominal modifier
nsubj
https://universaldependencies.org/u/dep/nsubj.html Nominal subject
nsubj:pass
https://universaldependencies.org/u/dep/nsubj-pass.html Passive nominal subject
nummod
https://universaldependencies.org/u/dep/nummod.html Numeric modifier
obj
https://universaldependencies.org/u/dep/obj.html Object
obl
https://universaldependencies.org/u/dep/obl.html Oblique nominal
parataxis
https://universaldependencies.org/u/dep/parataxis.html Parataxis
punct
https://universaldependencies.org/u/dep/punct.html Punctuation
root
https://universaldependencies.org/u/dep/root.html Root
xcomp
https://universaldependencies.org/u/dep/xcomp.html Open clausal complement
compound:prt
https://universaldependencies.org/u/dep/compound-prt.html a verbal compound
Contained by
May contain Empty element
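The legal values of @ana can also be checked outside the schema. The following Python sketch lists the values above and derives the documentation URL for a relation. The names LEGAL_ANA, ud_dep_url and check_ana are hypothetical, and the URL rule (a colon becomes a hyphen) is inferred from the entries listed above, not taken from the schema itself.

```python
# Legal values of link/@ana, copied from the list above.
LEGAL_ANA = frozenset("""
acl acl:relcl advcl advmod amod appos aux aux:pass case cc ccomp cc_preconj
conj cop csubj dep det discourse expl expl:pass expl:pv fixed flat
flat_foreign flat_name iobj mark nmod nmod:poss nsubj nsubj:pass nummod obj
obl parataxis punct root xcomp compound:prt
""".split())

UD_DEP_BASE = "https://universaldependencies.org/u/dep/"


def ud_dep_url(relation: str) -> str:
    """Return the documentation URL for a relation (':' becomes '-')."""
    return UD_DEP_BASE + relation.replace(":", "-") + ".html"


def check_ana(value: str) -> bool:
    """Return True if value is one of the legal values of @ana."""
    return value in LEGAL_ANA


print(check_ana("nsubj:pass"), ud_dep_url("nsubj:pass"))
```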

<linkGrp>

<linkGrp> Developed by the Fryske Akademy, June 2024. Content adheres to https://universaldependencies.org.
Module linking
Attributes
type
Status Optional
Datatype teidata.enumerated
Suggested values include:
UD-SYN
syntactic relations: https://universaldependencies.org/u/dep/index.html
targFunc
Status Optional
Datatype teidata.enumerated
Suggested values include:
head argument
The first entry in link/@target functions as the head, the second entry as the argument.
Contained by
May contain Empty element
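A minimal Python sketch of how @type, @targFunc, @ana and @target could work together. The sample fragment, its xml:id values and the function name dependencies are invented for illustration. The sketch assumes the attributes are not namespaced and that the elements are in the TEI namespace.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment: the sentence and the xml:id values are invented.
SAMPLE = """
<s xmlns="http://www.tei-c.org/ns/1.0">
  <w xml:id="w1">Dogs</w>
  <w xml:id="w2">bark</w>
  <linkGrp type="UD-SYN" targFunc="head argument">
    <link ana="nsubj" target="#w2 #w1"/>
  </linkGrp>
</s>
"""


def dependencies(root):
    """Yield (head, argument, relation) for each link in a UD-SYN group."""
    for group in root.iter(TEI_NS + "linkGrp"):
        if group.get("type") != "UD-SYN":
            continue
        # targFunc names the role of each entry in link/@target, in order.
        roles = group.get("targFunc", "").split()
        for link in group.iter(TEI_NS + "link"):
            targets = dict(zip(roles, link.get("target", "").split()))
            yield targets.get("head"), targets.get("argument"), link.get("ana")


print(list(dependencies(ET.fromstring(SAMPLE))))
```

Reading the roles from @targFunc, not hard-coding the order, keeps the sketch valid should a group declare its roles differently.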

<m>

<m>
Module analysis
Attributes
  • att.linguistic
    • @pos
  • att.features
    • @islemma
    • @abbr
    • @poss
    • @reflex
    • @prefix
    • @prontype
    • @case
    • @tense
    • @voice
    • @number
    • @person
    • @verbtype
    • @verbform
    • @polite
    • @numtype
    • @degree
    • @mood
    • @gender
    • @hyph
    • @prodrop
    • @clitic
    • @inflection
    • @suffix
    • @valency
    • @convertedfrom
    • @predicate
    • @construction
Contained by
May contain Empty element

<TEI>

<TEI>
Attributes
linguisticsversion
Status Required
Datatype teidata.enumerated
Legal values are:
4
3
2
1
Contained by
May contain Empty element
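Because @linguisticsversion is required on <TEI>, a processing script may want to check it before reading anything else. The following Python sketch shows one way to do so. The function name linguistics_version and the abbreviated sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Legal values of TEI/@linguisticsversion as listed above.
LEGAL_VERSIONS = {"1", "2", "3", "4"}


def linguistics_version(tei_xml: str) -> str:
    """Return TEI/@linguisticsversion; raise ValueError if missing or illegal."""
    root = ET.fromstring(tei_xml)
    version = root.get("linguisticsversion")
    if version not in LEGAL_VERSIONS:
        raise ValueError(f"illegal or missing linguisticsversion: {version!r}")
    return version


# Hypothetical, heavily abbreviated document (no teiHeader or text).
print(linguistics_version(
    '<TEI xmlns="http://www.tei-c.org/ns/1.0" linguisticsversion="4"/>'
))
```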

<w>

<w>
Module analysis
Attributes
  • att.features
    • @islemma
    • @abbr
    • @poss
    • @reflex
    • @prefix
    • @prontype
    • @case
    • @tense
    • @voice
    • @number
    • @person
    • @verbtype
    • @verbform
    • @polite
    • @numtype
    • @degree
    • @mood
    • @gender
    • @hyph
    • @prodrop
    • @clitic
    • @inflection
    • @suffix
    • @valency
    • @convertedfrom
    • @predicate
    • @construction
Contained by
May contain Empty element
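As an illustration of how the att.features attributes might be read from a token, the following Python sketch collects the attributes present on a <w> (or <m>) element into a dictionary. The sample token and the names FEATURE_ATTRS and features are invented. The sketch assumes the attributes are not namespaced.

```python
import xml.etree.ElementTree as ET

# Attribute names of att.features as listed for <w> and <m>.
FEATURE_ATTRS = (
    "islemma abbr poss reflex prefix prontype case tense voice number person "
    "verbtype verbform polite numtype degree mood gender hyph prodrop clitic "
    "inflection suffix valency convertedfrom predicate construction"
).split()

# Hypothetical token; the word and its annotation are invented.
SAMPLE = ('<w xmlns="http://www.tei-c.org/ns/1.0" '
          'tense="past" number="sing" person="3">walked</w>')


def features(element):
    """Return the att.features attributes present on a <w> or <m> element."""
    return {name: element.get(name) for name in FEATURE_ATTRS
            if element.get(name) is not None}


print(features(ET.fromstring(SAMPLE)))
```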

Schema corpora_linguistics: Attribute classes

att.features

att.features Developed by the Fryske Akademy, July 2019. Content adheres to https://universaldependencies.org.
Module analysis
Members m w
Attributes
islemma https://universaldependencies.org/u/overview/morphology.html Boolean. Is this a base form?
Status Optional
Datatype teidata.enumerated
Legal values are:
yes
abbr https://universaldependencies.org/u/feat/Abbr.html Boolean feature. Is this an abbreviation?
Status Optional
Datatype teidata.enumerated
Legal values are:
yes
poss https://universaldependencies.org/u/feat/Poss.html Boolean feature. Is this word possessive?
Status Optional
Datatype teidata.enumerated
Legal values are:
yes
reflex https://universaldependencies.org/u/feat/Reflex.html Boolean feature, typically of pronouns or determiners. It tells whether the word is reflexive, i.e. refers to the subject of its clause.
Status Optional
Datatype teidata.enumerated
Legal values are:
yes
prefix https://universaldependencies.org/u/feat/Prefix.html Boolean feature. Is this a prefix word in a compound that usually cannot stand on its own?
Status Optional
Datatype teidata.enumerated
Legal values are:
yes
prontype https://universaldependencies.org/u/feat/PronType.html This feature typically applies to pronouns, pronominal adjectives (determiners), pronominal numerals (quantifiers) and pronominal adverbs.
Status Optional
Datatype teidata.enumerated
Legal values are:
prs
personal pronoun or determiner
rcp
reciprocal pronoun
art
Article is a special case of determiner that bears the feature of definiteness
int
interrogative pronoun, determiner, numeral or adverb
rel
relative pronoun, determiner, numeral or adverb
ind
indefinite pronoun, determiner, numeral or adverb
emp
Emphatic pro-adjectives (determiners) emphasize the nominal they depend on.
exc
exclamative determiner
dem
Demonstrative pronouns are often parallel to interrogatives.
case https://universaldependencies.org/u/feat/Case.html Case is usually an inflectional feature of nouns.
Status Optional
Datatype teidata.enumerated
Legal values are:
nom
nominative
acc
accusative
dat
dative
gen
genitive
ins
instrumental / instructive
par
partitive
tense https://universaldependencies.org/u/feat/Tense.html Tense is typically a feature of verbs.
Status Optional
Datatype teidata.enumerated
Legal values are:
past
past tense
pres
present tense
fut
future tense
voice https://universaldependencies.org/u/feat/Voice.html Voice is typically a feature of verbs.
Status Optional
Datatype teidata.enumerated
Legal values are:
act
The subject of the verb is the doer of the action (agent).
pass
The subject of the verb is affected by the action (patient).
number https://universaldependencies.org/u/feat/Number.html Number is usually an inflectional feature of nouns.
Status Optional
Datatype teidata.enumerated
Legal values are:
sing
A singular noun denotes one person, animal or thing.
plur
A plural noun denotes several persons, animals or things.
ptan
Plurale tantum, some nouns appear only in the plural form even though they denote one thing.
coll
Collective or mass or singulare tantum applies to words that use grammatical singular to describe sets of objects.
person https://universaldependencies.org/u/feat/Person.html Person is typically a feature of personal and possessive pronouns / determiners, and of verbs.
Status Optional
Datatype teidata.enumerated
Legal values are:
1
The first person refers just to the speaker / author and in plural one or more additional persons.
2
The second person refers to the addressee(s).
3
The third person refers to one or more persons that are neither speakers nor addressees.
verbtype https://universaldependencies.org/u/feat/VerbType.html Distinctions on top of verb and aux.
Status Optional
Datatype teidata.enumerated
Legal values are:
mod
Verbs that take infinitive of another verb as argument and add various modes of possibility, necessity etc.
tense
Verb used to create periphrastic verb forms (tenses, passives etc.).
verbform https://universaldependencies.org/u/feat/VerbForm.html Form of verb or deverbative.
Status Optional
Datatype teidata.enumerated
Legal values are:
inf
Infinitive is the citation form of verbs in many languages.
part
Participle is a non-finite verb form that shares properties of verbs and adjectives.
ger
Gerund is a non-finite verb form that shares properties of verbs and nouns.
conv
The converb, also called adverbial participle or transgressive, is a non-finite verb form that shares properties of verbs and adverbs.
polite https://universaldependencies.org/u/feat/Polite.html Various languages have various means to express politeness or respect.
Status Optional
Datatype teidata.enumerated
Legal values are:
infm
usually meant for communication with family members and close friends.
form
usually meant for communication with strangers and people of higher social status.
numtype https://universaldependencies.org/u/feat/NumType.html Numeral type.
Status Optional
Datatype teidata.enumerated
Legal values are:
ord
ordinal number (first, second, ...)
card
cardinal number (one, two, many, ...)
degree https://universaldependencies.org/u/feat/Degree.html Degree of comparison is typically an inflectional feature of some adjectives and adverbs.
Status Optional
Datatype teidata.enumerated
Legal values are:
pos
positive, first degree
cmp
comparative, second degree
sup
superlative, third degree
dim
Added to features in universaldependencies. Diminutive.
mood https://universaldependencies.org/u/feat/Mood.html Mood is a feature that expresses modality and subclassifies finite verb forms.
Status Optional
Datatype teidata.enumerated
Legal values are:
imp
The speaker uses imperative to order or ask the addressee to do the action of the verb.
sub
The subjunctive mood is used under certain circumstances in subordinate clauses, typically for actions that are subjective or otherwise uncertain.
ind
A verb in indicative merely states that something happens, has happened or will happen.
gender https://universaldependencies.org/u/feat/Gender.html Gender.
Status Optional
Datatype teidata.enumerated
Legal values are:
masc
masculine gender
fem
feminine gender
neut
neuter gender
com
Some languages do not distinguish masculine/feminine but they do distinguish neuter vs. non-neuter. The non-neuter is called common gender.
hyph https://universaldependencies.org/u/feat/all.html#hyph-hyphenated-compound-or-part-of-it Is this part of a hyphenated compound? Depending on tokenization, the compound may be one token or be split into several tokens; then the tokens need tags.
Status Optional
Datatype teidata.enumerated
Legal values are:
yes
prodrop Added for Frisian to MISC in universaldependencies. Pronoun drop: omission of pronouns because they can be inferred.
Status Optional
Datatype teidata.enumerated
Legal values are:
yes
clitic Added for Frisian to features in universaldependencies. Most personal pronouns have a clitic form, which is the result of either vowel deletion, vowel reduction, monophthongization or schwa deletion, while there are also cases of suppletion.
Status Optional
Datatype teidata.enumerated
Legal values are:
yes
inflection Not in universaldependencies. The modification of a word to express different grammatical categories such as tense, case, voice, aspect, person.
Status Optional
Datatype teidata.enumerated
Legal values are:
infl
Not in universaldependencies. inflected
uninf
Not in universaldependencies. uninflected
suffix Not in universaldependencies. Boolean feature. Is this a suffix word in a compound that usually cannot stand on its own?
Status Optional
Datatype teidata.enumerated
Legal values are:
yes
valency Not in universaldependencies. Verb valency or valence is the number of arguments controlled by a verbal predicate.
Status Optional
Datatype teidata.enumerated
Legal values are:
1
An intransitive verb takes one argument (no object)
2
A monotransitive verb takes two arguments (of which one object)
3
A ditransitive verb takes three arguments (of which a direct and an indirect object)
convertedfrom Not in universaldependencies. Words belonging to one part of speech category used as another category.
Status Optional
Datatype teidata.enumerated
Legal values are:
adj
Not in universaldependencies. adjective used as another category
adv
Not in universaldependencies. adverb used as another category
ver
Not in universaldependencies. verb used as another category
num
Not in universaldependencies. numeral used as another category
pro
Not in universaldependencies. pronoun used as another category
part
Not in universaldependencies. participle (verbform part) used as another category
predicate Not in universaldependencies. Predicate.
Status Optional
Datatype teidata.enumerated
Legal values are:
yes
Not in universaldependencies. statement about the subject
construction Not in universaldependencies. Construction.
Status Optional
Datatype teidata.enumerated
Legal values are:
attr
Not in universaldependencies. attributive
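The value lists above lend themselves to a simple lookup-based check. The following Python sketch covers only a handful of the attributes; the table LEGAL_VALUES and the function invalid_features are hypothetical names, and attributes missing from the table are simply not checked. Note that the Boolean attributes list yes as their only legal value, so a false value presumably has to be expressed by omitting the attribute.

```python
# Legal values for a few att.features attributes, copied from the lists above.
# This table is deliberately partial; unlisted attributes are not checked.
LEGAL_VALUES = {
    "islemma": {"yes"},
    "tense": {"past", "pres", "fut"},
    "voice": {"act", "pass"},
    "number": {"sing", "plur", "ptan", "coll"},
    "person": {"1", "2", "3"},
    "mood": {"imp", "sub", "ind"},
    "gender": {"masc", "fem", "neut", "com"},
    "valency": {"1", "2", "3"},
}


def invalid_features(attrs: dict) -> dict:
    """Return the attribute/value pairs that are not legal per LEGAL_VALUES."""
    return {name: value for name, value in attrs.items()
            if name in LEGAL_VALUES and value not in LEGAL_VALUES[name]}


print(invalid_features({"tense": "past", "number": "dual", "islemma": "no"}))
```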

att.linguistic

att.linguistic 
Members m
Attributes
pos https://universaldependencies.org/u/pos/index.html These tags mark the core part-of-speech categories.
Status Optional
Datatype teidata.enumerated
Legal values are:
adj
Adjectives are words that typically modify nouns and specify their properties or attributes.
adp
Adposition is a cover term for prepositions and postpositions.
adv
Adverbs are words that typically modify verbs for such categories as time, place, direction or manner.
aux
An auxiliary is a function word that accompanies the lexical verb of a verb phrase and expresses grammatical distinctions not carried by the lexical verb, such as person, number, tense, mood, aspect, voice or evidentiality.
cconj
A coordinating conjunction is a word that links words or larger constituents without syntactically subordinating one to the other and expresses a semantic relationship between them.
det
Determiners are words that modify nouns or noun phrases and express the reference of the noun phrase in context.
intj
An interjection is a word that is used most often as an exclamation or part of an exclamation.
noun
Nouns are a part of speech typically denoting a person, place, thing, animal or idea.
num
A numeral is a word, functioning most typically as a determiner, adjective or pronoun, that expresses a number and a relation to the number, such as quantity, sequence, frequency or fraction.
part
Particles are function words that must be associated with another word or phrase to impart meaning and that do not satisfy definitions of other universal parts of speech.
pron
Pronouns are words that substitute for nouns or noun phrases, whose meaning is recoverable from the linguistic or extralinguistic context.
propn
A proper noun is a noun (or nominal content word) that is the name (or part of the name) of a specific individual, place, or object.
punct
Punctuation marks are non-alphabetical characters and character groups used in many languages to delimit linguistic units in printed text.
conj
Not in universaldependencies. A conjunction links constructions; no assumption is made about the role of the constructions.
sconj
A subordinating conjunction is a conjunction that links constructions by making one of them a constituent of the other.
sym
A symbol is a word-like entity that differs from ordinary words by form, function, or both.
verb
A verb is a member of the syntactic class of words that typically signal events and actions.
x
The tag X is used for words that for some reason cannot be assigned a real part-of-speech category.
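The @pos values above are the Universal Dependencies part-of-speech tags in lower case, plus conj. A possible conversion to the upper-case UD tags is sketched below in Python. The names LEGAL_POS, NOT_IN_UD and to_upos are hypothetical, and returning None for conj is a choice made for this sketch only; the schema does not prescribe it.

```python
# Legal values of @pos, copied from the list above.
LEGAL_POS = frozenset(
    "adj adp adv aux cconj det intj noun num part pron propn punct conj "
    "sconj sym verb x".split()
)

# 'conj' is marked above as not in universaldependencies.
NOT_IN_UD = frozenset({"conj"})


def to_upos(pos: str):
    """Return the upper-case UD tag for a @pos value, or None for 'conj'."""
    if pos not in LEGAL_POS:
        raise ValueError(f"illegal pos value: {pos!r}")
    return None if pos in NOT_IN_UD else pos.upper()


print(to_upos("propn"), to_upos("conj"))
```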
Eduard Drenth. Date: 2024-06