Annotation guidelines
PARSEME shared task on automatic identification of verbal MWEs - edition 1.0 (2017)

Categories of verbal MWEs

In this task we distinguish the following categories of verbal MWEs:

  • Two universal categories, i.e. valid for all languages participating in the task:
    • light verb constructions (LVC):
      • държа под контрол to keep under control
      • eine Rede halten a speech hold to give a speech
      • κάνω μία βόλτα make-1SG a walk to walk
      • to give a lecture
      • hacer una foto to_make a picture to take a picture
      • avoir du courage to have courage
      • fare un discorso give a speech to give a speech
      • ħa deċizjoni took a decision
      • podjąć decyzję to take a decision
      • fazer uma promessa to make a promise
      • a lua o decizie to take a decision to make a decision
      • imeti predavanje, sprejeti odločitev to take a decision
    • idioms (ID):
      • правя се на дръж ми шапката to behave myself as 'hold my hat' pretend to be naive and innocent
      • schwarz fahren to drive black take a ride without a ticket, in Kraft treten into force step to come into effect, in die Waagschale werfen in the weighing pan throw to bring to bear
      • χάνω τα αυγά και τα καλάθια lose-1SG the eggs and the baskets to be at a complete and utter loss
      • to go bananas, fortune favors the bold
      • hacer de tripas corazón make of intestines heart to pluck up the courage
        entrar en vigor enter in vigor to come into force/effect
      • défendre son bifteck defend one's beefsteak to defend one's interests
      • entrare in vigore to enter into force to come into effect, gettare le perle ai porci to throw the pearls to the pigs to waste something good on someone who doesn't care about it
      • għasfur żgħir qalli a bird small told me to hear something from the grapevine
      • rzucać grochem o ścianę throw peas against a wall to try to convince somebody in vain
      • fazer das tripas coração transform the tripes into heart to try everything possible
      • a trage pe sfoară to pull on rope to fool
      • ubiti dve muhi na en mah to achieve two aims at once, spati kot ubit sleep like dead sleep soundly
  • Two quasi-universal categories, valid for some language groups or languages but not all:
    • inherently reflexive verbs (IReflV):
      • усмихвам се to smile
      • sich bemühen to endeavour, sich enthalten himself contain to abstain
      • n.a.
      • suicidarse to suicide
        quejarse to complain
      • se suicider to suicide
      • suicidarsi to suicide
      • bać się to fear SELF to be afraid
      • se queixar to complain
      • a se gândi to think
      • bati se to be afraid
    • verb-particle combinations (VPC):
      • not applicable to Bulgarian
      • er gibt auf he gives up, er wirft ihr das vor he throws her that against he reproaches her for that
      • μπαίνω μέσα get in to go bankrupt
      • to do in
      • n.a.
      • buttare giù throw down to swallow
      • not applicable to Polish
      • jogar fora (this seems to be the only VPC in Portuguese; we annotate it as ID and do not use the VPC category)
      • n.a.
      • dati skozi give through to go through, gre za it goes about it is about
  • language-specific categories, each defined for a particular language in separate documentation.
  • other verbal MWEs (OTH), which gather the types not belonging to any of the categories above:
    • цъфна и вържа to blossom and give fruit (usually sarcastically) to prosper
      река и отсека to say and cut to say firmly, decisively
    • einen drauf setzen going one better
    • απορώ και εξίσταμαι wonder1SG.PST and be-amazed1SG.PST to wonder
    • to drink and drive
      to voice act
      to pretty-print
      to short-circuit
      to tumble dry
    • coser y cantar to_sew and to_sing easy as pie, a piece of cake
    • court-circuiter to short-circuit
    • andare e venire to come and go back and forth
      to short-circuit
    • iqum u joqgħod jump and stay to fidget
    • pluć i łapać to spit and catch to be lazy, to do nothing useful
    • pintar e bordar paint and knit to abuse
    • a tunat și i-a adunat it.has thundered and CL.ACC-it.has gathered birds of a feather flock together

In practice, to identify and categorize verbal MWEs during manual annotation, one must use the rigorous generic and category-specific tests provided.
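
The category inventory above can be sketched as a small data structure, for instance to validate labels in an annotation tool. This is only an illustrative sketch, not part of the guidelines: the names `VMWECategory`, `NOT_USED` and `allowed_categories` are hypothetical, and the exclusion table contains only the three languages that the list above explicitly names as not using VPC. Language-specific categories are not modeled.

```python
from enum import Enum


class VMWECategory(Enum):
    """Verbal MWE categories; values are the labels used in the guidelines."""
    LVC = "LVC"        # light verb constructions
    ID = "ID"          # idioms
    IREFLV = "IReflV"  # inherently reflexive verbs
    VPC = "VPC"        # verb-particle combinations
    OTH = "OTH"        # other verbal MWEs


# Scope of each category as described in the list above.
UNIVERSAL = {VMWECategory.LVC, VMWECategory.ID}
QUASI_UNIVERSAL = {VMWECategory.IREFLV, VMWECategory.VPC}

# Assumption: only exclusions that the text states by language name are
# recorded here; the table is illustrative and not exhaustive.
NOT_USED = {
    VMWECategory.VPC: {"Bulgarian", "Polish", "Portuguese"},
}


def allowed_categories(language: str) -> set[VMWECategory]:
    """Return the categories that may be assigned for the given language."""
    return {
        cat for cat in VMWECategory
        if language not in NOT_USED.get(cat, set())
    }
```

For example, `allowed_categories("Polish")` omits `VMWECategory.VPC`, while a language absent from `NOT_USED` gets all five categories. Keeping exclusions in a separate table, rather than hard-coding them per category, makes it easy to extend the sketch as further language-specific decisions are documented.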