YouTube transcripts – get useless auto-transcripts replaced with a helpful version!

I’ve just been having a late lunch-break and watching a video on YouTube, as you do. One of my many hobbies is crochet, so I decided to watch this video from ‘Girlybunches’ on how to make a ‘hyperbolic crochet brooch’ – but don’t worry if you don’t care in the slightest about crochet; that’s not what I’m writing about.

I liked what I saw so I looked for the subscribe button. I admit it should have been hard to miss, being bright red an’all, but I wasn’t wearing my glasses! So I started looking through various available buttons and found one called ‘transcript’. Intriguing! Obviously I had to find out more!

What I found out was that machine transcription has a long way to go!

Here is a short sample of what Olivia from Girlybunches actually said:

I just think the half-treble gives you just a little bit more length, which makes it come out a little bit more. Another point is, make this with, erm, now how can I be polite …

Here’s what the machine transcription thought she said:

I just think home trouble Kyushu district bit mornings which makes it come out a little bit more am another point he’s make meese with I’m know how can be polite

Honestly, it’s all like this, I’m not just picking the worst bits!

Here’s another bit.

And magically … and I will put links down below to my video showing how to do these things. You know, you won’t have to worry about not knowing how to do them ‘cause I’ve shown you. And you just do twelve in the loop …

Or, alternatively, from the machine transcriber:

and magically armpit links temple known to my video showing how much do these things time you know you don’t have to you worry about not my cup Stephen itself option you and you just eat well in the …

Now I don’t really know why YouTube provides transcripts – is it to help with SEO? If so then frankly it probably won’t! What comes from the machine transcriber will probably miss most of your essential keywords! Is it to help people with a hearing impairment? Again, if you look at these examples you’ll see that it probably won’t!

Perhaps it’s confused by Olivia’s English accent? OK, let’s find an American video to compare. Here we go, here’s one by CinnaZilla aka the Delta Quadrant. He’s got an American accent and you can find him here. This is what he actually says:

Hey Youtubers, I am starting to create a series of videos on crochet techniques.

Here’s the machine transcription:

pages firsthand and starting too creatine series videos on appreciate techniques

And so it goes on.

So what’s to be done? Well, the good news is that you can replace the terrible machine captions with a quality transcription containing the real words that you say. How? Well, it’s quite a straightforward process. First, you get your friendly neighbourhood transcriptionists (hopefully us here at Penguin Transcription!) to make a transcript of your file.

Then you go to your channel page on YouTube, choose the video you want, click the bar under ‘edit’ and then choose captions. In the captions page you have the opportunity to upload the transcript that you’ve received from us. Then, once it’s uploaded, you click on the box that says ‘sync’ and YouTube will attempt to sync your voice with the uploaded transcript. It takes a little while, but not too long, for that to happen. And then, hey presto, a transcript that’s actually useful!

Focus Group recording – top ten tips

If you’re planning to record focus groups and get them transcribed later on, here are some things you may find it useful to consider before starting your recording:

  • Check with the participants before the focus group starts that they do not mind being recorded for later transcription. Do this well in advance as if one person objects you may have to abandon the recording.
  • Conduct explanations about your research and give background information before switching on the recorder, to save on recording time.
  • If you need to have the different speakers identified in the focus group transcription ask each person to introduce him/herself. Just saying their name is not enough. For the transcriber to get a ‘handle’ on the voices, they will need to each say a couple of sentences. Use something linked to your focus group topic. So for example, if your group is about farmers’ experience of vets, ask each farmer to say their name, where they farm and what livestock they keep on the farm.
  • Lay down the ground rules to participants before you start e.g. remind them not to talk over each other as this will cause problems for the transcriptionist.
  • Use an external microphone (or even more than one) on your recorder. Internal mikes are only suitable for dictation (one voice). Ideally, if you have more than four people, use a series of microphones.
  • Record the group in a quiet place. Background noise can drastically reduce the quality of the recording and increase the time taken to transcribe.
  • Make sure you use a recorder that has a facility for transferring files to a PC.
  • Use a file format that is compressed, so that it can be transferred over the internet to your transcriptionist.
  • Check your recorder is recording before you start the focus group!
  • Do not serve food while recording the group as the noises of eating will obscure participants’ speech.
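
To give a feel for why a compressed format matters when you send files over the internet, here is a rough back-of-the-envelope sketch in Python. The two bitrates are my own illustrative assumptions (CD-quality uncompressed WAV, and a 64 kbps MP3 as a plausible setting for speech); your recorder's actual settings may well differ.

```python
def audio_size_mb(duration_minutes, bitrate_kbps):
    """Approximate file size in megabytes for a constant-bitrate recording."""
    bits = bitrate_kbps * 1000 * duration_minutes * 60
    return bits / 8 / 1_000_000  # bits -> bytes -> megabytes (decimal)


# Assumed bitrates, for illustration only.
WAV_CD_KBPS = 1411.2   # 44,100 samples/s x 16 bits x 2 channels, uncompressed
MP3_SPEECH_KBPS = 64   # a hypothetical compressed setting for speech

for label, kbps in [("WAV", WAV_CD_KBPS), ("MP3", MP3_SPEECH_KBPS)]:
    print(f"One-hour focus group as {label}: about {audio_size_mb(60, kbps):.0f} MB")
```

Under those assumptions an hour of uncompressed audio comes to several hundred megabytes, while the compressed version is a few tens of megabytes, which is a much friendlier upload.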

For more detailed information on focus group recording, or to request a quote, please see our focus group page at Penguin Transcription

Valentine’s Day – it’s all about love, and we love transcription!

We all love transcription here – which is a bonus as it’s what we do for most of each working day, but it’s rather like dentistry or chiropody – although we love it, it puts a shudder up most people’s spines. Sally at www.wordnerd.co.uk described it as ‘trudging through treacle’, and she is certainly not alone.

So why do we love it? I think the number one thing for all of us is variety – one day we could be typing children talking about a local film project and the next day (or the next hour) we could be transcribing blue-chip directors discussing their use of technology or top academics discussing the intricacies of animal parasites. OK, that last one doesn’t sound too appealing, I have to admit, but the point is every single project is different, and within the projects, every single interview is different. What’s not to love?

Well, there are a few things we don’t love – notably poor-quality recordings, badly moderated focus groups, people at meetings who eat and talk at the same time (not pleasant if you’re actually in the room, but way, way worse on a recording) and bits of hardware or software suddenly packing up for no reason.

But even taking all those things into account we all enjoy the opportunity to learn a little bit about so many different things. It's guaranteed never to be boring. And of course we don't spend all day transcribing – just most of it. Recently we had Rory the Penguin in for a photo-shoot for our upcoming newsletter. He enjoyed his visit and participated fully in the life of the office.

Getting the best from a recording for transcription

There are many transcription services available but sometimes an affordable transcription service can seem hard to find. Transcription is not cheap, because it is a lot more involved than copy typing, but that doesn’t mean you can’t find a good deal with a transcription service, and what’s more, by providing good quality recordings you can make the transcription more affordable, as it will take less time to complete.

Here are a few things to consider:

Time Taken to Transcribe

When pricing up your options the most important thing to remember is that it’s just not possible to type as fast as you speak. Even an experienced transcriptionist will take, on average, four times the length of a good, clear one-to-one interview to transcribe it – so an hour of recording will take an average of four hours to transcribe. (Industry standards obtained from the Industry Production Standards Guide, published by OBC, Columbus, OH, USA). But a poor quality recording will take much longer. So how can you make sure that your recording is clear, in order to get an affordable transcription price? Basically, the easier you make the transcription for the transcriptionist, the more likely they are to be able to give you an affordable transcription quote.


First of all, use the best recording equipment you can afford, and make sure it’s right for your needs. This means that for interviews you should have a recorder with an external microphone rather than one built into the recorder, which is only designed to pick up dictation. For focus groups you should ideally have several microphones so that all participants are audible, and for conferences the speakers should have good microphones and there should also be people in the audience with ‘roving’ microphones to take around to any audience members wanting to ask a question.


Always try to make sure that you are recording in a quiet environment. Open windows can cause big problems unless you have a ‘noise cancelling’ microphone, which many digital ones are these days. So can air conditioning, so if you do have an air conditioning unit in the room try to ensure your speakers are not situated close to it. If conducting interviews by phone, and assuming that you have arranged these in advance (and asked permission to record, of course) then it’s helpful to ask your interviewee to try to make sure they’re in a quiet environment too!


If you are interviewing and you want the names included then it is helpful to spell out your interviewee’s name at the beginning of the recording, before starting the interview, and speak out any information you would like on the transcript header e.g. the date, the job title of your interviewee etc. For conferences a speaker list and also a delegate list, if there will be audience questions, can save the transcriptionist a lot of time in trying to work out names and organisations.

Care with Conversation 

During the interview, unless you need to interrupt in order to take back control of the interview, try not to speak over your interviewee. Often in a normal conversation we say ‘yeah, yeah, yeah’ or ‘right’ or ‘OK’ more to indicate we’re listening than for any other reason. Every time you say that you are likely to be obscuring a much more important word or group of words spoken by your interviewee. And in conferences or panel discussions, if one speaker is giving a talk (i.e. without interruptions, not a discussion) make sure everyone else’s microphones are turned off. I have, in the past, had to mark whole sentences or even paragraphs of a talk as inaudible, because all I could hear were two panel members chatting about their holidays or little Jonny’s operation, and not the speaker!


Most transcriptionists work in a standard format, whether that be tabular, tabbed, interviews shown as initials or full names etc. Again most are happy to work to your specifications, but the standard format might well be cheaper, so think carefully about whether you need something different or not. Find out what the standard format is in advance if it concerns you, and you may be able to adapt it to your needs.


Finally, give some serious thought to whether or not you need a verbatim transcription. Verbatim transcription includes every repeated word, every ‘um’ and ‘erm’, all those ‘filler’ phrases like ‘you know’ and ‘know what I mean’ that may be repeated a hundred times in one interview, and can also include pauses, coughs, throat clearing etc. if required. Needless to say, this takes longer. If the transcriptionist can filter out all this stuff the transcript is quicker. In my company the cheapest level is what we call ‘intelligent verbatim’ which cuts out all these fillers but leaves the rest exactly as it’s spoken. Different transcriptionists work this differently though, so always check when you’re phoning for your quote. You can find detailed information about our editing levels on our website.

There are, of course, occasions when verbatim is required – depending on your topic it might be required for legal reasons, or you might be studying the language. But if you really don’t need it, don’t end up paying for it!
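
To make the difference between verbatim and a filtered transcript a bit more concrete, here is a toy sketch in Python of the kind of tidying involved. It is only an illustration, not how we (or any transcriptionist) actually work: the filler list and the function name are my own assumptions, and a real person uses judgement that a pattern match cannot, for example keeping ‘you know’ in ‘Do you know the answer?’.

```python
import re

# Hypothetical filler list, for illustration only.
FILLERS = ["know what I mean", "you know", "erm", "um", "er", "uh"]

# A filler, plus any comma and spacing immediately around it.
_FILLER_RE = re.compile(
    r",?\s*\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?",
    flags=re.IGNORECASE,
)
# A word immediately repeated one or more times ("I I think").
_REPEAT_RE = re.compile(r"\b(\w+)(\s+\1\b)+", flags=re.IGNORECASE)


def strip_fillers(text):
    """Remove listed fillers and immediate word repetitions from a passage."""
    text = _FILLER_RE.sub("", text)
    text = _REPEAT_RE.sub(r"\1", text)
    text = re.sub(r"\s{2,}", " ", text)  # tidy any doubled spaces left behind
    return text.strip()


print(strip_fillers("I I think, um, it works"))  # I think it works
```

Note that this naive version also throws away the commas around a filler and would wrongly collapse a legitimate repetition such as ‘that that’, which is exactly why the job still needs a human.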


And finally, remember that the cheapest transcription quote might not be the most affordable one in the end. There is an oft-quoted phrase: if you pay peanuts you get monkeys. Will it really be cost-effective to send your hard-won interviews to the cheapest service if what comes back is gobbledygook and you have to go through the whole thing correcting every other word? How much time will you then waste that could have been spent more productively? Recommendation is always the ideal way to find a service, but if no one you know can recommend a transcription service then look for testimonials. A good company with a strong track record should always be able to provide these. If you’re still not sure, ask questions and base your decision on the quality of the answers. Things you might like to ask are: turnaround time (when will you get the transcripts), confidentiality procedures, whether they have experience in your field, what the standard format is etc.

How to successfully get a conference transcribed

A word on timing

The most important piece of advice I would give as a transcriptionist is that if you’re going to have your conference transcribed you should arrange for completion of the transcription before the conference even takes place! Of course you are going to want to send the transcript (or your interpretation of it) out to your speakers and delegates as soon as possible after the conference takes place, but a conference is a significant chunk of work to transcribe.

Let’s take an example of a conference where the talks (and possible workshops etc.) total 5 hours. Even if you have excellent audio recording equipment and supremely clear speakers, with minimal question and answer sessions or workshops (the point of which I will explain in a moment) the time taken to transcribe is going to be four times as long as the recording – so you’re looking at an absolute minimum length of time taken in this example of 20 hours. Twenty hours of work is probably a minimum of three days’ work for one person, and there’s a very good chance it will take longer.
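
If you would like to try the sums for your own event, the arithmetic can be sketched in a few lines of Python. The four-to-one ratio is the figure quoted above; the seven-hour working day is purely my assumption for the illustration, and the result is a best case for clear audio, not a quote.

```python
import math


def transcription_estimate(recording_hours, ratio=4.0, hours_per_day=7.0):
    """Return (hours of work, whole working days) for one transcriptionist.

    ratio: hours of typing per hour of clear recording (four, as above).
    hours_per_day: assumed length of a working day (hypothetical).
    """
    work_hours = recording_hours * ratio
    work_days = math.ceil(work_hours / hours_per_day)
    return work_hours, work_days


hours, days = transcription_estimate(5)
print(f"{hours:.0f} hours of work, at least {days} working days")
```

For the five-hour conference this gives the 20 hours and three working days mentioned above; raise the ratio for poorer audio or lots of audience questions and the days mount up quickly.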

A good, established transcription company, employing fully trained and competent transcriptionists who are able not just to type but also to proof-read and edit, recognise the correct homophones (words that sound the same but are spelt differently), and punctuate English correctly, is probably going to be booked up for at least the next few days. If you book in your recording before the conference and agree to send it on a certain date, they will be able to turn it around for you much faster.

Audience questions and participation

Question and answer sessions are often tricky because of the range of different voices involved. This applies to the audience but also to a panel if you are having panel sessions.

For audience sessions, make sure you have ‘roving microphones’ that can be carried around the audience, so that questions are actually audible on the recording. A good conference recording set-up, so that your main speakers can be clearly heard, and individual microphones for each member of the panel are also essential. These may well come with the conference venue but make sure you check this in advance!

What your transcriptionist needs from you

Another very useful tip is to provide the transcriptionist with both a speaker list and a delegate list. Then during the conference ask the Chair to ask all delegates to state their name and position before asking the question. The transcriptionist can then refer back to the delegate list to insert the correct spelling into the transcript. The same applies, of course, to speakers, although they don’t need to state their names if you provide an agenda and they are introduced.

It is also very useful to provide the transcriptionist with any supporting material on the conference that you have available as this will help to establish ‘key words’, words that may be not in common usage but particularly relevant to the topic of the conference. A good transcriptionist will also probably be able to search out most unusual words, but this takes extra time, and if you have already provided material to help, time will be saved.

Audio or video

A videoed conference probably won’t add a huge amount to aid the transcriptionist, although if there are large numbers of slides used then it may be helpful; or it could be equally helpful to simply provide a copy of the slide presentation. The disadvantage of sending video files is their size: sound files are large; video files are huge! The larger your file is, the longer it will take for you to upload it to the internet for the transcriptionist to download it, and the longer it takes the more likely you are to lose the internet connection, which means that you’ll probably have to start all over again!

The choices are to either convert the video to audio before sending, or to simply put the video on a DVD and pop it in the post. We can receive video via our file-sending service, but the issues above do still apply, and for a file as long as a conference it might be quicker to use snail mail!
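
For anyone comfortable with a command line, one common way to do the video-to-audio conversion is the free ffmpeg tool. The Python sketch below just builds and runs an ffmpeg command; it assumes ffmpeg is already installed on your machine, and the file names and the 64k bitrate are hypothetical choices of mine, not a recommendation from any particular venue or recorder.

```python
import subprocess


def build_ffmpeg_command(video_path, audio_path, bitrate="64k"):
    """Build an ffmpeg command that keeps only the audio track of a video."""
    return [
        "ffmpeg",
        "-i", str(video_path),  # input video file
        "-vn",                  # leave out the video stream
        "-b:a", bitrate,        # target audio bitrate (assumed value)
        str(audio_path),        # output; format is inferred from the extension
    ]


def extract_audio(video_path, audio_path):
    """Run the conversion; raises CalledProcessError if ffmpeg reports failure."""
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)


# Example with hypothetical file names:
# extract_audio("conference.mp4", "conference.mp3")
```

The resulting audio file should be a small fraction of the size of the video, which makes the upload far less likely to fall over halfway through.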

Scientific transcription

I went to another fascinating series of talks at the John Innes Centre last night: ‘nature’s chemical tool kit’. I’ve been a ‘Friend of the John Innes Centre’ ever since I found out such a thing existed, and they always provide excellent, entertaining and ‘accessible’ science. They also very kindly provide a light supper afterwards and a chance to chat.

I was chatting to one of the ladies that worked there and we got talking about transcription… as you do. She commented that as a non-scientist herself she had had a scientific meeting transcribed a year or so ago and had thought it would be wonderful to have all the complex scientific words put in by someone so she didn’t have to worry about it. Unfortunately of course, when the transcript came back all the names of chemical compounds, genes, plants etc. were just left blank or marked as [unknown word]!

This is not all that surprising given that most transcriptionists (though by no means all) come from a secretarial background and won’t necessarily be familiar with scientific terms, and this is an area where we can help! As I have a PhD in biology, I’m already familiar with the basic scientific terms, and even if I’m not familiar with the precise scientific term someone uses, I have enough of a scientific background to know where and how to start looking it up, which is actually one of the most important skills in transcription, to my mind.

Of course, there are things the client can do to ensure a better result from the start. It’s always helpful to send any slides, PowerPoint presentations, abstracts and publicity material along with the audio, which will provide further clues as to what’s being spoken about.

If you have audience asking questions, as they did last night, then do make sure you have roving microphones. John Innes were very organised about this last night; they had two people with roving mikes in a fairly small auditorium so it was quick and easy to get the microphone over to whoever wanted to ask a question.

Also, if you want the audience members identified (not necessary last night, but it often is in a more formal environment) do make sure you ask them before the questions start, to identify themselves before they ask their question, and then send the delegate list to your transcriptionist so that s/he has the spellings. And remind the chairman that just because he knows it’s ‘Old Corky’ sitting at the back, saying ‘Hello Corky – let’s have your question then’ will not allow the transcriptionist to recognise ‘George Wellington Wells’ on the delegate list! (I’ve had this happen on many an occasion!)
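
To show why the delegate list only helps when people give their proper names, here is a small Python sketch of the look-up a transcriptionist is effectively doing by eye. It uses the standard library's difflib to find the closest spelling on the list. Apart from George Wellington Wells, the delegate names are invented for the example, and the 0.6 cutoff is simply difflib's default rather than anything tuned.

```python
import difflib


def match_delegate(heard_name, delegate_list, cutoff=0.6):
    """Return the closest delegate-list spelling of a name as heard, or None."""
    lookup = {name.lower(): name for name in delegate_list}
    matches = difflib.get_close_matches(
        heard_name.lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


# Hypothetical delegate list for illustration.
delegates = ["George Wellington Wells", "Anna Hartley", "Sam Patel"]

print(match_delegate("George Wellington Wels", delegates))  # George Wellington Wells
print(match_delegate("Corky", delegates))                   # None
```

A slightly misheard full name still finds its match, but a nickname gives the transcriptionist (and the code) nothing to go on.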

Organising a conference – get the transcription booked in early!

A lot of conference organisers like to have the keynote speech, at the very least, transcribed, and many also like to be able to send delegates copies of all the speeches or publish selected parts of the speeches on their websites.

Some people just publish abstracts sent in by the speakers, and that’s fine, but others want what’s really said on the day to be recorded for posterity, complete with panel sessions, audience queries, workshops and so on.

If you’re one of the latter, you might want to take a look at my article about getting your conference transcribed, which can be found here. I have just realised it’s a little out of date – surely there are no conference centres still recording onto cassette tapes? But the rest of the article is still valid. I will get that last bit updated soon!

The key points are:

  • Book your transcription well in advance
  • Use roving microphones if you are having questions from the audience
  • Provide your transcriptionist with as much information as possible, including speaker list, delegate list, keyword list (if possible), agenda and general information about the conference’s content

The first point is probably the most important. A good transcription service is unlikely to be sitting waiting for your call and ready to swing into action when you say, ‘I have 20 hours of conference transcription and I need it back tomorrow please’!


Transcribing when a translator is present

We’ve just taken on a big project, bigger in fact than either we or our client realised at first, as he wasn’t sure how many hours of recording he had! All the files have the interviewer, the interviewee and a translator. We only have to transcribe the English – which is a good thing, as between us in the office we only have a smattering of French and Spanish, and this is something a bit more exotic!

Should be easy, you’re probably thinking – after all, if the sound file is an hour long you’re probably only transcribing half an hour’s worth! So why, you might very reasonably ask, am I charging this client our standard rate, and not a reduced amount?

Well in fact I have offered a reduced rate if the recordings are really clear and the translator speaks excellent English – but we (my client and I) rather doubt there are any recordings like that! This is inevitable and I am in no way blaming the client, or indeed the translator! Of course in an ideal world, all recordings for transcription would take place in a quiet office space with the windows closed, no air conditioning on (because it can play havoc with the recorder!) and, where a translator is required, the translator speaking immaculate English with no accent, as well as speaking the tongue he’s translating from perfectly.

Unfortunately real life does rather tend to get in the way – and when you’re recording in rural China or in a war zone, or even an oral history at a little railway museum in the UK, all of which are projects we have worked on in the past, the chances of being able to find a nice, quiet office to work in are pretty small. The chances of finding a perfectly bilingual translator are even slighter!

So although we can, in theory, race through the non-English parts and just type the English, in this project the recording quality is quite poor, the translators’ English leaves much to be desired and many of the translators also have strong accents. Also, the nature of conversation between three people means that the discussions are not clearly and neatly divided into English and the other language. Often, while the translator is trying to do his bit in English, the interviewee thinks of something else he wanted to say and interrupts. Sometimes the translator is talking to the interviewee and then quickly throws a few words in English at the interviewer, before replying to the interviewee in the other language. So we really have to keep our ears ‘peeled’ and listen to everything, even though we can only understand half of it!

All these issues mean it’s taking about as long, or sometimes longer, to transcribe as a good quality, all English transcription, so it’s costing about the same. All I can say is it’s a good job that we all like a challenge at Penguin Transcription!

‘Hidden noise’ problems when recording for transcription

Sometimes background noise in a recording is unavoidable, but it should be pretty obvious that it’s going to affect the recording! Examples might be a recording in a train station with lots of announcements and train noise in the background, a very noisy cafe with other conversations going on around you (not to mention coffee machines) or in a room with a bunch of screaming kids. We’ve transcribed all these sorts of recordings – sometimes it’s frustrating but we accept that at times it’s just unavoidable. With noises like this though it comes as no surprise to the researcher when we say, ‘Background noise is a bit of a problem!’

However, there are quite a few ‘hidden’ noises that can also cause problems, and, unless you’re aware of the possibility of them being a problem, it’s likely that you won’t notice them until it’s too late. Some of these are avoidable if you are prepared in advance – but some will fall into the ‘just have to live with it’ category, as above! However, hopefully the ‘hidden noises’ below will provide you with a few extra things to look out for before starting your recording.

  1. Silent ring. A mobile phone, even if set to silent ring, can interfere with the recorder, so that for the period that it’s vibrating or silent ringing the recording is inaudible. If you need to have mobile phones set to silent, we suggest you place them as far as possible from the recorder/microphone.
  2. Taking notes. When you take notes, you will hardly hear the sound of your pen shuffling across the paper, but the recorder will, if you are writing next to it! We have had recordings sent in where the speech is actually inaudible because the interviewee is sitting a bit away from the recorder and the interviewer is sitting almost on it and busily scribbling notes! The simple solution is to make sure the recorder is closer to the interviewee and to make sure that, if you need to take notes, the pad is not very close to the recorder/microphone.
  3. Shuffling papers. Similar to above – if you have paperwork, or anything else for that matter, such as a handbag, right next to the recorder, then shuffling or rustling noises can sound very loud.
  4. Wobbly recorder. Not a common problem, but we have one client who always seems to have it – if the recorder isn’t lying flat and the table or whatever the recorder is sitting on wobbles a bit, the recorder will rock and the sound of it moving will be considerably louder than the sound of the people speaking!
  5. Air conditioning. Air conditioning, while not sounding especially loud in the room, can interfere with some recorders and make the recording useless. DO take a short sample recording if you’re in a room with air conditioning, while the air conditioning is running, and make sure it’s OK. If it isn’t and you can’t change rooms, open the window if possible! The sound of traffic/building works etc. outside isn’t ideal either, and may cause some inaudible sections, but that’s better than an unusable recording!

Choosing recording equipment for your recordings for transcription

The variety and quantity of recording equipment on the market is increasing all the time, and the choice you face when seeking recording equipment can be quite bewildering. To help you make your choice I have written a very straightforward (and very basic) article, aimed at giving those people who have not done any recording before somewhere to start from. It outlines the most important points to bear in mind when choosing your recording method for transcripts of meetings, focus groups, lectures and dictated notes, and it can be found here. This article doesn’t cover telephone and conference transcription, as there are a number of special considerations for these types of recording, and indeed a good conference venue will often (though by no means always) have a recording set-up in place.

If you have any queries about what equipment to choose for your specific needs, or any further advice for people choosing equipment, feel free to comment.