
Context is Key


I think one of the main reasons I love proofreading transcripts is that it’s fun ferreting out and changing those odd little mistakes where the sentence someone’s typed makes perfect sense, but in context it’s nonsense. It’s a bit like doing a puzzle really. Context is so often the key.

It never ceases to amaze me that people providing transcripts sometimes get it so wrong – but I won’t deny that I also make mistakes – that’s why proofreading back through is so important! An example of a mistake I proofread the other day was a colloquialism. Of course not everyone is familiar with all colloquialisms, but I was a bit surprised the transcriber didn’t highlight this one as a query. It went something like this*: ‘I like doing x, but then I like doing y too. X is fun but takes a long time. Y is a bit less fun but it’s quick. It’s swings and roundabouts really.’ The transcriber had put ‘swims and roundabouts’ which at least gave me a chuckle. There’s no excuse for it though – when someone is transcribing this and types ‘swims and roundabouts’ a ‘that’s odd’ flag should automatically start waving in their brain. Then all you have to do is look it up on dear ol’ Google. You immediately get, other than a few references to an Angry Birds theme park that will include ‘a mixture of themed swims and roundabouts…’, a notification saying ‘Did you mean swings and roundabouts?’

Then there’s the homonyms of course, or strictly speaking homophones – where words sound identical but are actually spelt differently. The obvious suspects are things like ‘they’re’ and ‘their’, or ‘aloud’ and ‘allowed’, but to be honest I wouldn’t employ anyone who couldn’t manage those! It’s the subtle ones that do still crop up though:

  • ‘It was all together a fine kettle of fish’ is wrong. It should be ‘It was altogether a fine kettle of fish’, because ‘all together’ means various things in one place, but ‘altogether’ means completely.
  • ‘I was going to brooch the subject’ is nonsense because a brooch is a piece of jewellery. The word should be ‘broach’, which means to bring up for discussion.
  • ‘The road was tortuous’ or ‘the road was torturous’? Well, either could potentially make sense. The first one means the road was full of twists and turns and the second one means it was full of pain and suffering.

The only way to know what the third example above should be, if the word itself isn’t clear on the tape after a few listens, is to look at the context. If the speaker goes on to say, ‘I thought if the bends got any tighter it would be quite dangerous’ then suggestion one is a winner, but if she says, ‘It was a journey I really didn’t want to make. I knew it was going to be painful before I started,’ then we’re looking at option two.
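For the technically minded, here is a rough Python sketch of how a first pass could wave that ‘that’s odd’ flag automatically. It is only an illustration under my own assumptions: the word list and the function name flag_confusables are made up for the example, and all it does is pull out easily confused words together with the words around them, so that a human proofreader can make the call from the context.

```python
import re

# Hypothetical starter list of easily confused words; a real one would be far longer.
CONFUSABLES = {
    "tortuous": "torturous",
    "torturous": "tortuous",
    "brooch": "broach",
    "broach": "brooch",
    "aloud": "allowed",
    "allowed": "aloud",
}


def flag_confusables(transcript: str, window: int = 8) -> list[tuple[str, str, str]]:
    """Return (word, alternative, context) for every confusable word found.

    The context is the flagged word plus up to `window` words either side.
    Nothing is corrected automatically; the proofreader decides.
    """
    words = transcript.split()
    flags = []
    for index, raw in enumerate(words):
        # Strip punctuation so that 'tortuous.' still matches 'tortuous'.
        word = re.sub(r"[^a-z]", "", raw.lower())
        if word in CONFUSABLES:
            start = max(0, index - window)
            context = " ".join(words[start:index + window + 1])
            flags.append((word, CONFUSABLES[word], context))
    return flags


sample = (
    "The road was tortuous. I thought if the bends got any tighter "
    "it would be quite dangerous."
)
for word, alternative, context in flag_confusables(sample):
    print(f"Check '{word}' (or '{alternative}'?): {context}")
```

The sketch deliberately stops at flagging. Deciding between the two words still needs someone to read the surrounding sentences, which is the whole point of this post.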

Another essential part of proofreading is research, generally internet-based, to check on people’s names, or locations mentioned in a transcript. Searching out an obscure village in Thailand, for instance, listening again and again to check, ‘Is that really what he said, or is it wishful thinking on my part, because it’s a name I’ve found?’ And then going back and seeing if he says anything else about the village to give me a clue… context again. Perhaps ‘It was near Chiang Mai’. Heck, the one I found is down in the south and Chiang Mai is up north – start again; but what a sense of achievement when you do track them down!

One has to be a little careful not to waste time though. Perhaps in the case above the interviewer knew exactly where the interviewee was talking about and could have filled in the blank in a second or two! So we always try to fill in the blanks, but if something is uncertain then we’ll always flag it up for the interviewer to double-check.

Then there’s bits you can’t quite hear – either the person’s mumbling or the recording isn’t great, or the interview is recorded somewhere noisy and a train went past blowing its whistle. I always like to make a stab at those, although I’ll always highlight them as only possibilities, rather than definite. An example cropped up today. Someone was talking about making a contribution to something, ‘but not very much and very tan-xxx-ly.’ I could hear the ‘tan’ and the ‘ly’ quite clearly but the whole word wasn’t quite clear. Context was key again – she’d only played a role from the sidelines so the missing word was ‘tangentially’.

So if you’re a novice transcriber reading this, do take on board that context is absolutely vital in this kind of work – and if you’re a potential client, please be assured that all work from Penguin Transcription is transcribed by a small team of experienced and knowledgeable transcribers, and then carefully proofread – taking context into account!

* I can’t use real examples as all our work is treated as strictly confidential.


YouTube transcripts – get useless auto-transcripts replaced with a helpful version!

I’ve just been having a late lunch-break and watching a video on YouTube, as you do. One of my many hobbies is crochet, so I decided to watch this video from ‘Girlybunches’ on how to make a ‘hyperbolic crochet brooch’ – but don’t worry if you don’t care in the slightest about crochet; that’s not what I’m writing about.

I liked what I saw so I looked for the subscribe button. I admit it should have been hard to miss, being bright red an’all, but I wasn’t wearing my glasses! So I started looking through various available buttons and found one called ‘transcript’. Intriguing! Obviously I had to find out more!

What I found out was that machine transcription has a long way to go!

Here is a short sample of what Olivia from Girlybunches actually said:

I just think the half-treble gives you just a little bit more length, which makes it come out a little bit more. Another point is, make this with, erm, now how can I be polite …

Here’s what the machine transcription thought she said:

I just think home trouble Kyushu district bit mornings which makes it come out a little bit more am another point he’s make meese with I’m know how can be polite

Honestly, it’s all like this, I’m not just picking the worst bits!

Here’s another bit.

And magically … and I will put links down below to my video showing how to do these things. You know, you won’t have to worry about not knowing how to do them ‘cause I’ve shown you. And you just do twelve in the loop …

Or, alternatively, from the machine transcriber:

and magically armpit links temple known to my video showing how much do these things time you know you don’t have to you worry about not my cup Stephen itself option you and you just eat well in the …

Now I don’t really know why YouTube provides transcripts – is it to help with SEO? If so then frankly it probably won’t! What comes from the machine transcriber will probably miss most of your essential keywords! Is it to help people with a hearing impairment? Again, if you look at these examples you’ll see that it probably won’t!

Perhaps it’s confused by Olivia’s English accent? OK, let’s find an American video to compare. Here we go, here’s one by CinnaZilla aka the Delta Quadrant. He’s got an American accent and you can find him here. This is what he actually says:

Hey Youtubers, I am starting to create a series of videos on crochet techniques.

Here’s the machine transcription:

pages firsthand and starting too creatine series videos on appreciate techniques

And so it goes on.
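For anyone who likes a number to go with their giggles, the gap between what was said and what the machine heard can be scored as a word error rate: the fewest word substitutions, deletions and insertions needed to turn one into the other, divided by the number of words actually spoken. Below is a minimal Python sketch of that idea. The function names are my own, made up for illustration, and I’m assuming that case and punctuation don’t count as errors.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lower-case the text and keep only words (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9'’]+", text.lower())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = tokenize(reference)
    hyp = tokenize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edits needed to turn the reference words seen
    # so far into the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


actual = "Hey Youtubers, I am starting to create a series of videos on crochet techniques."
machine = "pages firsthand and starting too creatine series videos on appreciate techniques"
print(f"{word_error_rate(actual, machine):.0%}")
```

On the CinnaZilla sentence this counts 9 errors against 14 spoken words, so roughly 64% of what he said has gone astray.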

So what’s to be done? Well the good news is that you can replace the terrible machine captions with a quality transcription containing the real words that you say. How? Well, it’s quite a straightforward process. First, you get your friendly neighbourhood transcriptionists (hopefully us here at Penguin Transcription!) to make a transcript of your file.

Then you go to your channel page on YouTube, choose the video you want, click the bar under ‘edit’ and then choose captions. In the captions page you have the opportunity to upload the transcript that you’ve received from us. Then, once it’s uploaded, you click on the box that says ‘sync’ and YouTube will attempt to sync your voice with the uploaded transcript. It takes a little while, but not too long, for that to happen. And then, hey presto, a transcript that’s actually useful!

We’re looking forward to working with you!

Transcription – offshoring, onshoring, in-housing, outsourcing?

Transcription might seem like an obvious thing to outsource and ‘offshore’. After all, ‘it’s only typing isn’t it? It’s not rocket science?’ And yet ‘onshoring’ has been in the news a lot lately, with both positive and negative slants. On the one hand, onshoring could boost the UK economy; on the other hand, the fact that it now ‘costs roughly the same’ to make noodles in China as it does in the UK, according to the recent news story about Symington’s Noodles bringing noodle production back from China to the UK, is an alarming indictment of the state of the UK economy. But it’s not just that companies feel they can now pay even lower wages to UK staff; it is also the rise and rise of wages in China, exchange rate fluctuations and shipping costs too.

So how does transcription fit into this discussion? Well, another recent argument for onshoring has been quality concerns. And this article about IT onshoring suggests a number of other important concerns too: “…time zone challenges, language and other communication issues, high turnover (up to 40% annually in some cases) in offshore locations, intellectual property and security risks (especially in unregulated countries like China), are just some of the unanticipated issues that have plagued offshore development.”

And a number of those issues could also affect transcription – the obvious one is language. Unless English is a first language then there is no way that someone can provide top quality ‘general’ transcription, i.e. interview transcription services and focus group or meeting transcription services. It is possible (though perhaps doubtful) that they can provide equal quality dictated notes, for example, but a conversation – full of idioms, homonyms, a wide variety of different technical terminologies – no.

So … if I’m suggesting you should keep your transcription ‘onshore’ then what about keeping it in-house? Surely keeping it as local as possible will minimise the problems? Well no, not necessarily. And this is where we come to the ‘just typing, not rocket science’ issue. It’s true – it’s not rocket science, but it does require specialist skills, and even if you’re lucky enough to have access to a secretary or PA who can type, that doesn’t mean they can provide fast, accurate, grammatically correct and readable transcripts from an audio file … and all that on top of their regular workload.

I’m sure it will come as no surprise that I am recommending onshoring and outsourcing, since this is the service that we offer here at Penguin Transcription, but I think you will agree that the arguments are valid.

Language is Evolving – and the Transcriptionist has to evolve with it!

The way language evolves has become a fascination for me since I started transcribing. I started thinking about this again when @Wordnerdsally on Twitter brought my attention to the latest rejection, by the Académie Française, of an English word (hashtag), because it’s damaging French language purity. Will they actually stop the Frenchman (or woman, or child) on the street from using the word? I very much doubt it!

The fundamental problem with the whole idea of the Académie Française as a protector of the purity of the French language is that language is not, and never has been, ‘pure’. It changes over time because it’s living and it evolves; people speak it, younger generations love to twist and turn it and make it their own, waves of immigration bring in new words and change old ones, and language just keeps on changing.

A favourite evolution of a word for me, into something more negative than its original meaning (pejoration, in linguistic terms, so I’m told), is ‘silly’. In Old English, ‘silly’ meant, of all things, blessed! If you were blessed, I suppose you were naturally thought to be innocent, so the word then started to mean innocent. By the time of Middle English (Chaucer’s era) that had evolved into ‘deserving compassion’. Not quite sure how the link worked there but I suppose if you were innocent of a crime and had been accused of it you would deserve compassion – maybe it evolved that way? Anyway, if you needed compassion, that must mean you were weak, right? Well no, probably not, but that seems to have been the thinking then, so ‘silly’ started to mean ‘weak’. From there it was ‘a short step’, says linguist Professor John McWhorter, to it coming to mean ignorant, and from ignorant it evolved into ‘lacking in good sense’, which is one of its meanings today.

This is probably a very over-simplified description of the evolution of silly, not least because it doesn’t only mean ‘lacking in good sense’. It can also mean frivolous and it can be used to describe objects, not just people. However, it’s an indication of the complexity of language and the difficulties inherent in making sense of it! And making sense of language, as it is spoken, and translating that spoken language to something that makes sense on the page, is really what transcription is all about.

The evolution of silly took place over hundreds of years, but some words change much faster than that, especially in spoken English, rather than the more formal English usually found in writing. An obvious recent example is ‘wicked’. When I was a lass ‘wicked’ meant evil, and to some it still does, but most people would hesitate to use it that way because in the younger generations it has come to mean ‘cool’ which of course when I was a lass meant slightly chilly, and not hip (a joint of the body?), groovy (having lines engraved in it?) or just ‘in with the current style’ (from ‘Urban Dictionary’).

So a good transcriptionist isn’t ‘only’ someone with a fantastic grasp of English spelling and grammar, but someone with their ‘finger on the pulse’ of current spoken English. Fortunately at Penguin Transcription our transcriptionists are not only experienced, but range considerably in age and background. When the word ‘mardy’ came up in a transcript I was doing I had no idea the word even existed and kept trying to ‘hear’ another word that would make sense to me, but then a younger colleague listened to it and grasped it straight away. (For those who are just approaching ‘middle age’ like me, or older, it means grumpy, surly or miserable apparently!)