Fish Tales

Naturally, there are many competing views surrounding language documentation methods. A question at the center of the debate is: what are we aiming to document? Some say that by collecting narratives and stories we’re merely attempting to preserve an idealized form of the language that no longer exists. The “Ancestral Code” is but an abstraction; perhaps it never existed. Yet even in the most developed countries, like the U.S., stories such as Cinderella are as important a component of our culture and language as the latest hashtags and messaging conversations that we mine. We’d be remiss to ignore or avoid these genres, especially given the relevance of indirect communication in the West African context.

Yesterday, I recorded some folktales with the president of the women’s association. Not only were the stories fascinating and full of surprises, but the narrator also described to me the manner in which she’d first been given the knowledge:

Life here revolves around rice cultivation, which is in and of itself tedious and exhausting work. When the farmers come home from the fields, they need nourishment and rest. Young children and elders remain at home during the day, but they’re not idle. Mothers and grandmothers gather lotus roots from the ponds and palm fruits from the trees. Together with the children, they extract sweet snacks for the men and women to enjoy when they return from the fields. Instead of paying those who helped prepare the fruits and tubers, the mothers and grandmothers used to recount stories to their children. The children would be so enthralled by the stories that they’d sit quietly and assist with the work for hours on end.

Now, thanks to development, cell phones and televisions have replaced the oral transmission of information, and the elders, of course, lament this change. Yet none of us wants to hinder progress, so the least we as researchers of the culture can do is use the tools at our disposal to everyone’s advantage by employing technology to capture the fading past and integrate it into the present via memory cards and CDs.

While we, as linguists, may not be accurately representing the actual state of language use by gathering such tales, at least we’re adding to the larger picture of what it means to be a part of the culture, of which language is an integral part.

LLACAN and ELAN: Guide to ELAN-CorpA

I feel so fortunate to have opportunities to travel and learn from the world’s linguistic and computational experts that I want to be sure to share with everyone! Also, as I have stated before, I have no background in informatics or computer science so believe me, if I can learn it, so can you.

Today’s lesson covers ELAN-CorpA, currently curated by LLACAN’s own Christian Chanard. Many of you Mac users have tried to download the program without success. That is because the latest Mac OS does not permit applications from unknown developers. The way to get around this is the following:

1. Download ELAN-CorpA here

Upon attempting to open the application, if you are a Mac user, you will see an error (screenshot).

2. Open Terminal and paste the following (full description here):

sudo spctl --master-disable

3. Open System Preferences on your computer and go to Security & Privacy

4. Click the lock at the lower left-hand corner of the window

5. You will see that the Anywhere Option has appeared, so click there

6. Return to your downloads and open the ELAN-CorpA application

7. Reset your security settings with:

sudo spctl --master-enable

You will also need to download a template (available from the LLACAN website for PC or Mac).

Now that you have installed ELAN-CorpA and downloaded the Lexical Template, you can open any ELAN file. Although ELAN-CorpA looks and works almost exactly like ELAN, the advantage of ELAN-CorpA over ELAN-simple, as it were, is that you have the option to import or create a lexicon, thereby bypassing the FieldWorks vs Toolbox debate. That is, whereas my workflow, and likely many of yours, involves exporting and importing back and forth between FLEx or Toolbox and ELAN, with ELAN-CorpA we can simply gloss and interlinearize within the program itself. Here is how:

From ELAN-CorpA, you can open any ELAN file. For the purposes of this tutorial, we will use one that was exported from Toolbox as it is very close to the downloaded template.

  1. Go to Type → Change Tier Type → tx Type Name => change to txx → Change → Close. Go again to Type → Import Types → Browse → select the downloaded template (.eft) → Import → Close (it will appear as if nothing has happened, but it has)
  2. Go to Tier → Copy Tier → Select (the first if there are many) tx tier → next → Select the same tx tier → next → select tx as the Type Name → Finish (you will now have a new tier which is a copy of the old tx tier)
  3. Go to Tier again → Change Tier Attributes → select the old tx tier → change tier name to ref@SP1 → change Tier Type to ref → Change
    Select copied tx tier → change tier name to tx@SP1 → Change → select ft tier → change tier name to ft@SP1
  4. If there is a word tier and/or a ref tier without any children, delete those
  5. Repeat for other participants in the file, except change SP1 to SP2
  6. Go to Tier → Tokenize Tier → Source tier (parent tier) is tx@SP1 → Destination tier → Create New Tier → Tier Name is mot@SP1 → Parent Tier is tx@SP1 → Tier Type is mot → Add → Close → Start → Close (you will now see that your transcribed tx tier is parsed into words on the mot tier)
  7. Repeat for each speaker
  8. Now, importing a lexicon from Toolbox is quite simple as there is already an existing txt file with your lexicon in it. From FieldWorks, do the following:
    In FieldWorks (FLEx), export the Full Lexicon (root-based) into SFM in a text file →
    open in Notepad ++ → search and replace g_en with ge and ps_en with ps (remove any underscores from the file)
  9. Back in ELAN-CorpA, go to Interlinearize tab → Lexicon → Import → select your lexicon.txt → align each field from the exported lexicon with its counterpart in ELAN-CorpA:
  10. (Screenshot of the field-alignment dialog)
  11. Select the field on the left, then its equivalent on the right, then click the >> in the middle to associate them (\lx is lexeme, \g is gloss, \ps is part of speech, \dt is date, ignore the others) → OK (now your lexicon should appear)
  12. On the right, under Parse & Annotate, next to Full Lexicon, select Interlinearize → a screen will appear (screenshot)
  13. Select Define interlinearization hierarchy → choose mot@SP1 as the interlinearization tier → OK → change to look like the screenshot → Create Tiers
  14. Click on any word in mot@SP1 → click Interlinearize between Full Lexicon and Autointerlinearize
  15. If the word is not in the lexicon, it will have an asterisk beside it *
  16. If the word has the asterisk, right click it → Insert Record *word
  17. If the word is composed of more than one morpheme, delete any affixes → give a Gloss and part of speech (Tier X) → save record → OK
  18. Click again on Interlinearize (repeat process for any affixes as necessary until no asterisk remains) → to the left, click each word in the lexicon (if the word is made up of more than one morpheme) → click on the right on each word (if the word is made up of more than one morpheme) → click Auto Interlinearize (the word will now be interlinearized on the newly made tiers)
  19. Repeat for each speaker and throughout the text
  20. Save your new interlinearized, time-aligned ELAN-CorpA file into a folder where you will be proud to see it and then create others to have a searchable corpus!
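
The search-and-replace in step 8 can also be scripted, which helps if you re-export the lexicon often. Below is a minimal Python sketch; it is not part of ELAN-CorpA or FLEx. The function name normalize_sfm_markers and the sample record are made up for illustration, and the sketch assumes that every field marker sits at the start of a line. Unlike a blanket search and replace, it only touches the markers, so underscores inside your lexical data are left alone.

```python
import re

# Renames taken from step 8; extend this if your export uses other markers.
MARKER_RENAMES = {"g_en": "ge", "ps_en": "ps"}

# A field marker: a backslash at the start of a line, then non-space characters.
_MARKER = re.compile(r"^\\(\S+)")


def normalize_sfm_markers(text, renames=MARKER_RENAMES):
    # Rename known markers and strip underscores from the remaining ones.
    # Markers that begin with an underscore (assumed to be header lines)
    # are left untouched. The result has no trailing newline.
    out = []
    for line in text.splitlines():
        match = _MARKER.match(line)
        if match and not match.group(1).startswith("_"):
            marker = match.group(1)
            marker = renames.get(marker, marker.replace("_", ""))
            line = "\\" + marker + line[match.end():]
        out.append(line)
    return "\n".join(out)


# Made-up record, only to show the shape of the input.
sample = "\\lx example\n\\g_en made-up gloss\n\\ps_en n\n\\dt 21/Sep/2018"
print(normalize_sfm_markers(sample))
```

To use it on a real export, read the file with open(path, encoding="utf-8"), pass its contents to the function, and write the result to a new file so that the original export stays intact.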

Bonus steps:

Especially for tonal languages, if you would like to have a pitch track to use for transcription, do the following:

  • Open a (short) sound file in Praat
  • Manipulate – To Manipulation
  • Minimum and maximum pitch 50–500
  • Save newly created pitch tier as text file
  • in ELAN-CorpA, edit linked (secondary) files, add .pitchtier file
  • View – viewer – Signal viewer and time viewer
  • Merge – right click in sound, stereo, merge

Links and resources:

http://llacan.vjf.cnrs.fr/ELAN-CorpA/download/51/install.htm
http://corpafroas.tge-adonis.fr/elan/install.htm

http://llacan.vjf.cnrs.fr/res_ELAN-CorpA.php

CorpaFroas – Online corpus:
http://corpafroas.huma-num.fr/Archives/corpus.php

Unrecognized Royalty

While my deepest sympathies go out to the British Royal family and those who mourn Her Majesty, I want to take this time to honor a far less recognized Royal family. The Chief of Bounou, in other words, the King of the Bangande, who also has passed several jubilees in his position, remains a dear friend of mine who has and will continue to contribute greatly to my research on the Bangime language and speakers.

While Chief Só Dicko does not inhabit a castle per se, the entire village of Bounou sits atop a cliff, giving the Bangande people an unparalleled view of any approaching danger from people, animals, or floods. This village serves as the gateway to the six other Bangande settlements.

My own journey in beginning to unravel the mystery of the Bangande people and language did begin in a castle. The Workshop on Language Islands in Africa was held in 2017 at the Ebernburg Castle of Bad Münster am Stein-Ebernburg in Germany. Please read and comment on my and Johann‐Mattis List’s latest publication Bangime: secret language, language isolate, or language island? A computer‐assisted case study and stay tuned for further updates about the origins of the ‘Small Bang’.

Can the Caste Ladder be Climbed?

I recently read the long-awaited book Caste. Well, I say read but I actually listened to the audio version from the North Carolina public library. I was on the waiting list for months. It’s an impressive work, no doubt; the amount of research Wilkerson put into it is phenomenal. However, my problems with the book arose from the beginning. I didn’t get the anonymous introduction to current events. Don’t get me wrong, dystopia is among my favorite genres, but not when it’s immersed in a work of non-fiction.

My next (much bigger) problem was that, while I do understand the importance of being a witness to historical events, I was not expecting this book to be so largely dedicated to providing shocking accounts of how slaves, and their descendants, were treated in the American South. I feel somewhat awkward saying this as a white woman who grew up in North Carolina. I suppose I could be accused of being biased. However, what bothered me so much about this central aspect of the book is that I felt it was unexpected, unnecessary, and, frankly, gratuitous.

At least in my view, the excessive violence and collective trauma that characterized slavery in America is a symptom of the so-called caste system rather than its origin. It’s for this reason that I felt Wilkerson’s detailed descriptions of the gruesome treatment of slaves and their descendants, and her drawing parallels with the Holocaust in this regard, were misplaced. Certainly some still need to learn these shocking realities lest they be forgotten and repeated, but I believe these triggers detract from the task at hand; in fact I feel they mask the real goal that I had expected the book to address: what are the origins of Wilkerson’s so-called caste system in America, and is she right in depicting it as such?

Although, admittedly, I felt forced to skip many sections of the book due to the concerns outlined above – I simply wasn’t prepared to read/listen to lengthy descriptions of torture – I still did not find answers to these questions for which I had waited so long. Foremost, I did not understand why Wilkerson insisted on following the caste analogy when her distinctions are largely binary: Black = “subordinate caste” while w(W?)hite = “dominant caste”. A much more interesting ‘thought experiment’, in my opinion and that of others who have reviewed the book, is how the multitude of layers that co-exist in America, not only of race but also, perhaps more importantly, of gender and economic status, interact with each other and the larger governing bodies. Could this system be called a ‘caste’? I think it depends on which caste system the analogy is based.

Here, I found it especially disappointing that, despite Wilkerson’s claim of researching anything and everything to do with caste, in a book mostly about the history of the descendants of West Africans, she didn’t look to the caste system of West Africa. For me at least, this is a huge missed opportunity to uncover a far less studied caste system that persists today and is perhaps more analogous than the one she chose: the much better known, but seemingly less connected, caste system of India. For instance, many would be shocked to learn that Africans sold other Africans into slavery during the Trans-Atlantic slave trade. She largely ignores Africa, or worse, especially given the sheer breadth of her research, makes strikingly naïve statements concerning “tribes” and, albeit citing someone other than herself, that there exist no black people in Africa; race is something that only becomes apparent in America.

From my own experiences in West Africa, one of them as recent as yesterday, thus prompting me to pen this post, racism has nothing to do with race, at least insofar as race is equated with color. My husband Momo and I are vacationing at a seaside resort in south-western Senegal. We’ve come here for years and have watched the owner build the place up from practically nothing, to the point where we’re considered part of the family because of our loyalty. Thus, when 30+ guests were expected for lunch, we were flexible with our plans and amenable to changes. However, the guests, who not only arrived late but also came in double their anticipated number, wanted to be attended to immediately. I should clarify at this point that the guests were primarily Senegalese from Dakar.

Our first reaction to them was surprise, as, in their haste to all be seated at the table that was set for the expected 30 people, not one person said hello to us; anyone who knows anything about West Africa knows this is usually a grievous error. Instead, those who couldn’t be seated right away glared at us as if it were our fault that they’d not informed the establishment of their actual number. And in fact, I was later to learn, they didn’t believe that the owner of the resort, also Senegalese, was who he said he was. They expected a “white” person, or at least a foreigner, rather than a compatriot to be running a successful resort. Despite Momo and me clearly also being guests, we felt there was resentment from the outset.

We decided to move to allow our hosts to use our table, though our intention was not to accommodate the rude intruders but to help out our friends, who were struggling to accommodate the unexpected arrivals. We already had a small table set up in the sand anyway, where Momo liked to make tea and I’d hung my yoga hammock from a tree. After we finished lunch, we got settled – Momo making tea and me reading under my hammock. One of the guests actually had the audacity to send over one of the staff to order Momo to make him a glass of tea! Naturally, Momo refused. The guest, the owner of a business that we support no less, ended up coming over and commanding Momo to give him a glass of tea. Still, there was no greeting or any other form of politeness. Momo proceeded to explain to him, in a manner much more civilized than I would have managed at that point, that he didn’t appreciate being asked for a service without having been acknowledged. The other man apologized, they exchanged pleasantries, and tea was provided.

Unfortunately, the other guests didn’t get the message, or they did and didn’t care, because many of them continued to harangue Momo for tea and made rude remarks when he said it wasn’t yet ready. Finally, they left and peace and tranquility were restored, but their mark was left on all of us. I remarked that I’d not experienced that type of disparity before in Casamance. Momo and the resort staff remarked that it often happens when dealing with “northerners”, those from Dakar especially. With caste so fresh in my mind, I hypothesized that this very same caste system, much more actively felt and practiced in the north of Senegal than in Casamance, was the cause of the strangers’ behavior. They perceive themselves as being much higher on the ladder than Momo; little do they know that he not only can make great tea, he also cooks in a Michelin star restaurant in Paris. In any case, caste is surely more than skin deep, with subconscious cues often causing its unknowing participants to make poor choices which perpetuate its existence. I hope these experiences will at least serve to promote some self-reflection, as they have for me.

What is poverty? A brief encounter with a multinational fishing community

Suffering is an integral part of what it means to be poor. Yesterday I visited a fishing encampment where Rémy was chief for two years. The people, almost entirely men, live in shabby lean-tos sunk into the mud of the mangroves. The village, Kahemba, is also a small island; it’s impossible to dig a well, so they’re forced to row dugout canoes over to the mainland, draw water from the well, fill numerous huge containers, load them back into the canoe, and row home. Not only do the people need water for drinking, washing, and bathing, in that order of importance, but they also need enough to give to their animals: pigs and chickens.

There are few women on the island; those who have come here from afar have come with their families, but the families remain in the regional capital, Ziguinchor. The women who live on Kahemba are obviously tough as nails.

I came to Kahemba because of my interest in their linguistic practices, but found myself equally interested in their culture and economics. I interviewed one of the settlement’s first residents; we spoke in Bamana as he’s from Mali. He’s lived in the village for nearly forty years, yet his Bamana seems barely tainted by Senegalese Mandinka or Guinean Susu, despite his being in constant contact with speakers of both related languages. He tells me how the fishing encampment as such began because of fishermen from Guinea who taught local Jóola residents methods for smoking and thus preserving the fish so that they could be transported to countries without the bountiful harvest, as it were.

I ask him, as well as a Toucoulor speaker, how their life is here: is it worth it to endure these conditions (the mud must be bad in the rainy season) in order to make a living for themselves and their families? They explain how, at one time, they made a lot of money and had few expenses, so yes, it was worth it. But now, they tell me, due to greedy fishing practices, they don’t make as much profit and they have to go farther and farther to find good fish. Plus, they’re heavily monitored by the nearest village so that their profits don’t go too far outside the region’s borders, which is of course the reason why most of the fishermen are here: to send money back to their countries of origin.

Which brings me back to the original question: what is poverty? Because in a way, the fishermen at Kahemba have money, but it is what they have to go through in order to get it that is beyond difficult. And that’s a huge issue in places like West Africa: there are not enough paid jobs to sustain the population. There’s not an even distribution between work and money. Even if it’s not poverty in the strictest sense of the word, it’s still undue suffering.

Fieldwork is not homework

I’m so fortunate to be able to live and work in this magically beautiful realm, and yet, a great working environment is still a workplace, and nothing compares to being home. The language consultants with whom I’m working are tireless; they push me when I’m ready to quit. And by all other standards, my field site is quite cushy: you can see there is tap water and electricity in the picture – that is, solar power that provides water through a pump from the well and batteries that hold a charge overnight for charging equipment. Nevertheless, transcription is grueling, even in a language without tones. The diphthongs and consonant lenition make up the difference in difficulty.

Plus there are always issues of ethics and compromise. One of the consultants with whom I work is a fisherman, another is a hunter. Neither will let me buy food in the market; they both insist on obtaining and providing us with their own, even if it means we’ll be eating Bambi or Flipper for supper (just kidding, actually they’d never kill a dolphin, but a shark or barracuda is fair game).

Then there are the interrogations. I should be empathic about interviews as I’m here in order to research the people’s culture, and the Jóola in particular are incredibly open to sharing even their most intimate events with me, such as the traditional funeral I was permitted to film in its entirety, but gosh I get tired of answering questions about myself over and over. Naturally, people are curious, but who I am and where I’m from is complicated, and I try to be honest, but in the field, I’m always the stranger among people who have lived with each other since their birth, until, as I just witnessed, their death.

I started this post a few days ago, but today, I’m back home at Kaïra Kunda. In hindsight I see the fatigue of fieldwork makes the comfort and familiarity of home all the more worth it. I’m also thinking how thankful I am to have the opportunity to work and live in this sanctuary that is Senegal.

A very Happy and Merry Christmas to all. Momo and I are here waving at you across the seas.

Fairy Tales

As mentioned in the previous post, animal tales are an integral part of growing up in West Africa, just as fairy tales are to those of us who grew up in The States. While it’s clear that Uncle Remus and Brer Rabbit stories have their origin in Africa, there are other, more subtle parallels between fairy tales and animal stories. For example, see if this story reminds you of anything:

“Once upon a time,

there existed a great forest realm

known as The Kingdom.

The path to The Kingdom was paved in crushed sea shells;

the people lived collectively in cylindrical stucco houses with thatched roofs open to the heavens.

At the center of The Kingdom, lived a powerful Rain King.

The Rain King ensured that The Kingdom had plentiful harvest.

None of the villages suffered from hunger.

However, the forest encompassed all save for three villages,

the last of which was situated at the land’s end.

There no trees grew save for great Baobabs.

Because the village had no wells, every day, the women of the village went early in the morning to draw water from the pond.

One day, a stranger [depicted in various Senegalese tales as Samba Seytani] paddled across the river in his hollowed out canoe.

When he arrived he was tired and thirsty.

He stopped at the first concession he encountered and asked for water to drink.

A girl went to draw water from the clay jar.

When she presented him with a full calabash of water,

he sensed a bad odour coming from the water.

When he hesitated to drink, she noticed.

He told her,

“The water smells bad.”

She told him that

the water smells bad because it had been sitting in the clay jar for three weeks.

The man was shocked.

“Why?”, he asked.

The girl explained to him that there was a crocodile who lived in the pond where the women drew water.

Every three weeks,

the village sacrificed a virgin to the crocodile to prevent it from eating them so they could draw water to drink.

“Ok”, he said, “I know what to do.”

So he asked the girl to show him the pond.

When they arrived at the pond, the man removed his shoes.

Then he loaded a rifle with bullets and shot the crocodile.

And then he cut a piece of the crocodile’s hide.

The next day, the village came with their sacrifice for the crocodile.

When they arrived at the pond, they saw the crocodile.

The virgin to be sacrificed climbed on top of the crocodile.

She said, “it’s dead!”

The King asked,

“Who killed the crocodile?”

Each young man in the village claimed it was he who killed the crocodile.

Then The King saw the shoes.

He asked the population,

“Whose shoes are these?”

The girl who had accompanied the man who shot the crocodile said that the shoes belonged to he who killed the crocodile.

Therefore, The King instructed each young man of the village to try on the shoes.

Each young man of the village tried and failed.

Finally the stranger to the village tried on the shoes.

They fit.

The man showed The King the son of the crocodile.

The King made the man a Prince and he was wed to the virgin the following day.

The people lived happily ever after.”

Visa Vistas (part two)

If I thought moving to England was stressful, that was just a warm-up for this move to France. No joke, if you look up the word bureaucracy in the dictionary…well the French invented the word so there you go. I write this in hopes that it may help someone in the future avoid some of my mistakes, and I realize that everyone has their own experience so some people may have no troubles at all.

Getting a job in Paris is a dream come true, especially at the world-renowned LLACAN laboratory of CNRS. And luckily, since President Macron has mad love for American scientists, we even get to apply for a special « Talent Visa ». Further, from spending so much time in West Africa, I have some language skills. I often think about how hard it would have been to get to where I am now if I were coming from a different background. That being said, this move was like being reborn, or what I imagine it is like to enter the world anew.

The first step was for CNRS to create a document called the “Hosting Agreement”, which is signed by the lab’s director and then sent to the prefecture of the suburb of Paris where the lab is located. The prefecture sends the document back to CNRS, who then send it to me. Naturally, this takes months. By the time I got the paperwork, I was just about ready to travel to Senegal. Further, you can’t just rock up at the French consulate in D.C., nor can you send in your paperwork for visa processing as you can for most countries. Instead, you have to book an appointment according to your area of the US and then travel to that city (for me it was Atlanta) with all your documents and submit them there. However, these smaller consulates cannot actually give you a visa; they must then submit your application to D.C. But, again, you cannot submit your application to D.C. directly. Of course, this meant there was not enough time to process my visa application before I was due to leave, but I tried anyway. After I had waited several hours with other applicants, they told me that I was crazy to even think they could give me my visa in time.

Somewhat serendipitously, I got sick and could not travel to Senegal, so I booked another appointment at the consulate in Atlanta. However, I then fell into the August rush of students applying for visas to study abroad in France, during which the process is temporarily taken over by VFS Global. As my appointment fell on the second day of the hand-over, it was chaos. My appointment letter told me to go to the consulate, which I did, arriving very early and feeling very prepared, except that one is unable to enter the consulate until ten minutes prior to the appointment time, whereupon they instructed me to go across town as quickly as possible as they were no longer handling visa applications.

After another wait of several hours, I was seen, my application was submitted, and my biometrics were collected. However, I received a call just after hours informing me that I had to return on Monday as they literally lost my biometrics. Third time was the charm I suppose and in fact my visa came in my passport quite soon after in the mail.

Now, remember that Hosting Agreement? Because one cannot submit documents directly to the consulate in D.C., and because by the time I submitted mine I did so via a literal third party, there was no way not to submit the original form. Apparently, however, this form, which originates with CNRS, is sent to the prefecture to be signed, is then sent to me, and is submitted to the consulate, was supposed to be stamped and signed by the consulate and sent back to me so I could take it back to CNRS, who then submit it back to the prefecture. Maybe the consulate also thought this was absurd because, well, they didn’t send it back. So, although I have a temporary visa in my passport, it is insufficient to obtain the residence card needed to live and work in France, and also to leave the country and re-enter.

Thus, after many failed attempts at contacting anyone in any consulate in America, the fine folks here at LLACAN performed heroic actions and persuaded many higher-ups to intervene on my behalf to convince the prefecture to give me a residence card without the original hosting agreement signed by the consulate. HR also got in touch to make me a special appointment, which was great because otherwise you have to stand in multiple lines for hours through various stages of processing. The woman with whom I was meeting even called me while I was on the way to let me know that I needed special stamps for (another) special temporary visa that I need in order to leave the country and re-enter, because I will not yet have a real residence permit, only a temporary one. The special stamps are to be bought not at the prefecture, but at a Chinese-run tobacco shop. Go figure.

So I got my special stamps and now I have a special visa, a talent visa, and a temporary residence card. I feel very special. I was feeling so special, in fact, that the other night I went out to celebrate. I bought a Navigo travel card for the metro so as to avoid all those little individual tickets and perhaps save some money on travel before I can get my bike here. Although there are two components to the Navigo card, one for travel and one for your identity, I thought the latter was optional to carry. Sadly, I was mistaken and was accosted by three RATP agents who shook me down for €50!

I’ll update this with a really positive experience from yesterday. I was getting a bit nervous about my medications running out before I went to Senegal, and I still can’t apply for insurance until I get my residence permit. So I asked what to do, as I’ve now learned that’s a better way than trying to find out on my own, and learned of a fantastic website http://www.doctolib.fr/ where I found a doctor close by with immediate availability, and in no time and for very little money, had a new prescription! That could never happen in the US without insurance! So thank you France for the kindness.

LLACAN & ELAN: la deuxième partie

Seems like linguistics these days is all about algorithms and corpora. Yet, because we’re not at the point where we can grep an audio file, a lot of archived raw data is not of much use to us. Further, even a recording’s thorough transcription and translation into a widely understood source language can be tedious to mine for specific instances of morphemes or parts of speech. I’m sure if you’re reading this, I’m preaching to the choir, so I’ll quit with the preamble and get to the point.

Jeffrey Heath is someone who is very generous with sharing his raw data. He, together with the Dogon and Bangime linguistics team, is preparing to add even more texts to those already available on the internet for Dogon, Bangime, Bozo, and Songhay. These Jamsay texts from 2004 are an example of an amazing resource. Every linguist has his or her own style, and I know Jeff prefers to keep his texts monolingual, with separate translations and transcriptions. This method may have its advantages, but for those of us here at the Discourse Reporting in African Languages project, our goal is to have an ELAN corpus with a consistent, searchable tier hierarchy across multiple languages. The Dogon languages are well-known for their logophoric pronouns, so of course we want to use as much Dogon data as we can get. So, how do we convert two separate transcription and translation docx files into ELAN-CorpA .eaf files with our tier structure in a way that is time-efficient enough to handle a lot of data?

Well, although it may be something of a two-steps-back, one-step-forward approach, the answer seems to be to go through Toolbox. Thus, we convert from Word to plain text, which is then easy to read and merge into a Toolbox text file, configure everything there to our ELAN-CorpA specifications, and then export/import. Here are the steps:

  1. Extract one text at a time, along with its translation, into two files (e.g. Jamsay-Text01.docx, Jamsay-Text01-ENG.docx)
  2. Mark the title with \id, and any subtitles with \ti
  3. Replace (first item with [space] second item):
    A: \spk A^p\tx
    B: \spk B^p\tx
    \spk \ref^p\spk
    \ref ^p\ref
    (, ^p^p\ref^p\spk B^p\tx)
    \r\n (space) \rn (no space)
  4. Save both files as plain text (select UTF-8 encoding)
  5. Open in Notepad++ and select Unicode Encoding (no BOM)
  6. Open in Toolbox as text:
  7. Select Tools – Break renumber text – Use this name – Entire Database
  8. Open in Notepad++ and change:
    \_sh v3.0  400  Text (open as entire text)
    to:
    \_sh v3.0  400  textRef (one line at a time)
  9. Open again in Toolbox:
    Ensure that both transcription and translation have same number of phrases!
  10. Change \tx in -ENG file to \ft
  11. Open the transcription file in Toolbox:
  12. Select Database > Merge Database (use these settings:)
  13. (Screenshot of the merge settings)
    Merge the translation into the transcription
  14. Import transcription file as toolbox file in ELAN using saved markers (available upon request)
  15. Set time to 5000
  16. Follow instructions from previous post for interlinearization 🙂
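
For larger batches, parts of steps 3, 9, and 10 can be scripted once the files are saved as plain text. What follows is a minimal Python sketch rather than a Toolbox or ELAN-CorpA feature. It reflects my reading of the first four replacements in step 3 (^p is Word's paragraph mark), it assumes each turn starts with a single capital letter and a colon, and the function names and sample lines are made up for illustration. The remaining replacements in step 3 are not covered.

```python
import re

# A speaker turn as it appears in the Word text, e.g. "A: some words".
_TURN = re.compile(r"^([A-Z]):\s*(.*)$")


def turns_to_records(text):
    # Step 3: each turn becomes a blank line, then \ref, \spk and \tx lines.
    # Lines that are not turns (title, subtitles) pass through unchanged.
    out = []
    for line in text.splitlines():
        match = _TURN.match(line)
        if match:
            speaker, words = match.groups()
            out += ["", "\\ref", f"\\spk {speaker}", f"\\tx {words}"]
        else:
            out.append(line)
    return "\n".join(out)


def tx_to_ft(text):
    # Step 10: relabel \tx lines as \ft in the translation (-ENG) file.
    return re.sub(r"(?m)^\\tx\b", r"\\ft", text)


def count_field(text, marker):
    # Count the lines that start with the given marker (without backslash).
    pattern = re.compile(rf"\\{marker}\b")
    return sum(1 for line in text.splitlines() if pattern.match(line))


def same_phrase_count(transcription, translation):
    # Step 9: both files must hold the same number of phrases before merging.
    return count_field(transcription, "tx") == count_field(translation, "ft")


# Made-up two-turn sample, only to show the shape of the data.
transcription = turns_to_records("A: first phrase\nB: second phrase")
translation = tx_to_ft(turns_to_records("A: first gloss\nB: second gloss"))
print(same_phrase_count(transcription, translation))
```

Running the count check before the merge in step 12 catches a missing or extra line early, which is much easier to fix in a text editor than after the two files have been merged.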