graemecoleman’s posterous

"Double Captioning" - my approach to using captions and subtitles as a means of learning a second language

Through my work with DMAG, I have become more and more attracted to the subject of captioning. However, this interest grew not as a result of my work-based interests within accessibility, but through my own personal experiences of using captions as a novel approach to learning a second (and now third) language. In this blog entry, I'll show you how I've used a few open source, or exceptionally cheap, tools to achieve this, and reflect upon how useful this approach has been for me personally. I should point out that I tend to use Mac OSX-based tools so, if you're a Windows or Linux user, you may have to hunt around for alternatives, but I'm sure they exist. Additionally, as I've been learning Danish (and, more recently, German), I'll be using that language as the basis for the discussion, although I'm pretty sure you can apply the techniques I'll demonstrate to any language you're motivated to spend a bit of time immersing yourself within.

Anyone who knows me will be aware that I've been learning Danish for 5-6 years now, albeit at a very slow pace - I guess I'm OK at reading it (with a little dictionary at hand just in case), but I'm much less confident comprehending spoken Danish, and even less confident at speaking it to others. There are several reasons behind this. Firstly, I've had to teach myself the language, as there are no evening classes in my area offering face-to-face tuition. Secondly, I find the whole textbook-based learning approach, with the audio supplements spoken by a jobbing actor in a monotonous drone, a bit dull (in fact, on one of the Danish CDs I bought, I seriously thought the actor in question was ready to top himself - the speaking clock showed more emotion than this guy). And, finally, I'll admit it - I'm a geek. If I can use technology to solve a problem, I will. Even if it means messing around with it myself and trying things out. The approach I have personally found useful is to learn the language through watching Danish DVDs on my iPod, taking advantage of the Danish captions (for the hard-of-hearing Danes) and the equivalent English subtitles (for non-Danes) and, crucially, through displaying both sets of captions and subtitles on the screen at the same time.

So, why take this approach? Firstly, it fosters a much more mobile and portable way of learning - on a long bus/train journey, I can select a film on my iPod and learn the language at the same time. I don't need to be carrying textbooks and CDs around with me in case I have a sudden urge to practise my listening comprehension. Secondly, it's a much more interesting way of learning - Danish DVDs are, of course, aimed at the Danish market, so you get to hear how real Danes speak, talking about real life subjects at a real pace, rather than Mr "I-went-to-RADA-you-know" explaining how one asks for the price of a banana. You never know, you might also enjoy the film - who could fail to be entranced by a film with the English title, "Sunshine Barry and the Disco Worms"? Of course, if you're completely new to the language, this is perhaps not the best approach - while it isn't a particularly motivating process, running through a few lessons in a textbook will help you with the basics of grammar.

I should point out that I have carried out no research to test out whether or not this is more generally a useful approach for others, although it is something I am hoping to follow up academically at some stage (albeit not using this specific approach, but thinking about captions as tools for language learning more generally). Rather, I came across this approach by accident, and by trial and error. About three years ago, I started buying DVDs from online Danish retailers - which, unfortunately, are extremely expensive (GBP20 per DVD, not including postage). Early on, I ordered possibly the most successful (although not really known internationally) Danish film of recent years, "Den Eneste Ene" - as an aside, it was remade in the UK as the literally translated "The One and Only" starring Patsy Kensit, and set in Newcastle rather than Copenhagen but, predictably, flopped. When the DVD dropped through the post, and given that most of the films I ordered had English subtitles (as my Danish was even ropier back then), I took a look at the subtitles offered - Norwegian, Swedish, Finnish, as well as Danish for the hard of hearing. Guh. That'll be twenty quid down the drain then. That was until I found out that there are various websites such as Open Subtitles where enthusiastic hobbyists have created subtitles in their own language and uploaded them for others to share. I quickly found English subtitles, hunted around for appropriate conversion and subtitle burning software, and stuck the film onto my iPod. I watched the film several times, finding that I could quickly match the spoken text with the English equivalent. As a result, and whenever I received a DVD through the post, I'd find out ways of working with captions and subtitles as a means of improving my abilities in comprehending the language, until I settled upon the "double subtitles" approach I'll demonstrate here.
In this approach, I display the Danish for the hard of hearing captions on the screen at the same time as the English subtitles.

 

An example of double captioning in Den Store Dag

The above picture is a still image from the film Den Store Dag ("The Big Day"). What I'd like to focus upon is the double subtitle feature, although if you wish to focus upon the delectable Louise Mieritz instead, I won't hold it against you. The text at the bottom represents the original captions - these are provided on the DVD for Danes who are hard of hearing. The text at the top is taken from the English subtitles, which are also on the DVD but (like the Danish captions) normally appear at the bottom of the screen, so I had to "grab" them from the film and move them to the top. The effect is that the viewer hears the Danish being spoken, but can also read how this Danish appears in text, as well as seeing an English translation, all at the same time. I am aware this might appear quite confusing at first, but with a bit of effort you can get used to watching films in this way.

This effect is a little tricky to achieve, so I will run through the process I take.

Stage 1

Convert the film to MP4 (or equivalent), "burning" the captions into the movie file

I use a free tool called HandBrake to convert the DVD to MP4 format. On slower machines, this can be a bit of a drag - on my old iBook G4 with 1 GB of memory, a two-hour film took two hours to convert. My new MacBook Pro (4 GB) is much faster, although you still need to give yourself twenty minutes or so to allow for conversion. The key, however, is to make sure you "burn" the Danish captions into the converted file. HandBrake is one of the few tools that will do this for you. Click on the "Audio and Subtitles" tab, then select the captions in the original language - do not select to burn the English subtitles, for reasons I shall explain later. Remember that this might take a while, so go and do something else while you wait for the file to convert.

Burning captions in handbrake

Stage 2

Source English subtitles

There are several ways you can source the English subtitles. Firstly, if they do not come on the DVD, check out Open Subtitles to see if a hobbyist has created them for you. Normally, these will be provided in .srt or .sub format, in which each subtitle/caption is represented alongside its respective timestamp. The second approach is to "grab" the English subtitles from the DVD itself. For this, I use another free tool called D-Subtitler. D-Subtitler uses optical character recognition (OCR) to identify subtitles on the screen and converts them to text.

Grabbing subtitles in D-Subtitler

To do this, double-click on the DVD icon, and drag the VIDEO_TS folder onto the "Objectif Mac" panel, then select "English" from the available languages. Again, this can take about ten minutes on a decently spec'd MacBook Pro, so give yourself plenty of time. Once the OCR conversion process is complete, you will normally be asked to clarify particular characters that the application has not been able to recognise - again, this can take time, and can be particularly frustrating, as you may have to confirm characters more than once. It will then save the document as a .srt file. Here is where it gets interesting. Despite the best efforts of you and the application, some characters will not have been converted properly - for example, you will find that the number 0 has been replaced by the letter O, the letters I (capital i) and l (lower case L) are muddled, and so on. Thankfully, .srt files can be opened by any bog standard text editor, so you can go through the script and make changes where required. To go back to the previous step, this is why you burn the captions for the target language rather than the English subtitles - it is much easier for you to recognise mistakes and correct them in English than it is in the language you are intending to learn.
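
If proofreading the whole script by eye gets tedious, a short script can at least point to the lines worth checking. The following is only a rough Python sketch and not part of the workflow described above: the function name find_suspects, the patterns and the sample text are all illustrative assumptions, and the heuristics will both miss some errors and flag some perfectly good lines. It relies on the usual .srt layout of a counter line, a timing line, then one or more lines of text.

```python
import re

# Matches an .srt timing line such as "00:01:02,500 --> 00:01:05,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")

# Heuristic patterns for typical OCR mix-ups (assumptions, not a complete list).
SUSPECTS = [
    (re.compile(r"\d[OoIl]|[OoIl]\d"), "letter next to a digit (O/0 or l/1?)"),
    (re.compile(r"[a-z]I[a-z]"), "capital I inside a lower-case word (l?)"),
    (re.compile(r"\bl\b"), "lone lower-case l (capital I?)"),
]


def find_suspects(srt_text):
    """Return (line_number, text, reason) for each suspicious subtitle line."""
    hits = []
    for number, line in enumerate(srt_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines, counter lines and timing lines; check dialogue only.
        # (A dialogue line made up purely of digits is skipped as well.)
        if not stripped or stripped.isdigit() or TIMING.match(stripped):
            continue
        for pattern, reason in SUSPECTS:
            if pattern.search(stripped):
                hits.append((number, stripped, reason))
                break  # one reason per line is enough to prompt a look
    return hits


# Made-up sample containing the kinds of mistakes mentioned above.
SAMPLE = """1
00:00:01,000 --> 00:00:03,000
l think it costs 2O kroner.

2
00:00:04,000 --> 00:00:06,000
HeIlo, I'll be there at 10.
"""

for number, text, reason in find_suspects(SAMPLE):
    print(f"line {number}: {text!r} - {reason}")
```

The script only reports line numbers; the actual corrections are still made by hand in the text editor, which keeps the final judgement with the person who knows what the line should say.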

A third approach, extremely time consuming but also somewhat rewarding (and one that I have carried out on more than one occasion), is to grab the captions for the hard of hearing and translate each individual caption into English. This isn't particularly difficult, as the timestamps should be included within the original file, but obviously it does take up a bit of time and effort. I generally take this approach as a last resort, if I cannot find English subtitles either online or on the DVD.
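
The reason the timestamps make this manageable is that only the text of each caption changes; the counter and timing lines carry over untouched. As a minimal sketch of that idea in Python (the function names split_cues and build_translated_srt and the two-caption sample are hypothetical, and the file is assumed to be a well-formed .srt with blank lines between captions):

```python
def split_cues(srt_text):
    """Split .srt text into (counter, timing, text) tuples."""
    cues = []
    # Captions are separated by blank lines; normalise Windows line endings first.
    for block in srt_text.replace("\r\n", "\n").strip().split("\n\n"):
        lines = block.strip().split("\n")
        if len(lines) >= 3:  # counter, timing, at least one line of text
            cues.append((lines[0], lines[1], "\n".join(lines[2:])))
    return cues


def build_translated_srt(cues, translations):
    """Reuse each original counter and timing with the translated text."""
    if len(cues) != len(translations):
        raise ValueError("need exactly one translation per caption")
    blocks = [
        f"{counter}\n{timing}\n{english}"
        for (counter, timing, _), english in zip(cues, translations)
    ]
    return "\n\n".join(blocks) + "\n"


# Made-up Danish captions standing in for the grabbed file.
DANISH = """1
00:00:01,000 --> 00:00:03,000
Hvad hedder du?

2
00:00:04,000 --> 00:00:06,000
Jeg hedder Anna.
"""

cues = split_cues(DANISH)
english = build_translated_srt(cues, ["What is your name?", "My name is Anna."])
print(english)
```

The translating itself is still done caption by caption by hand; the sketch only shows that the resulting English file lines up with the film because it inherits the Danish timings.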

Stage 3

Merge the English subtitles into the MP4 and export to iPod format

The final tool I use, which is unfortunately not free but extremely useful and which does the job perfectly, is Submerge. For just nine dollars, it's an absolute gem. The website itself explains all the features but, by way of a quick summary, Submerge allows you to burn the English subtitles onto the MP4 file you created earlier, and also to export it to the iPod-friendly .m4v format (as well as many other formats, if you prefer to watch the film on your laptop or other video device). The tool allows me to place the subtitles towards the top of the film, play around with different fonts and colours, and finally render the entire film to my iPod. Submerge effectively works the other way round to D-Subtitler by reconverting the (corrected) English subtitles to images, before burning them onto your MP4 file. Once again, depending upon the spec of your Mac, you should make yourself a tea or a coffee while you wait for Submerge to do its thing.

And that's basically that! You have created a doubly-subtitled movie file that you can stick on your iPod and watch over and over again in an attempt to improve your language skills. I use many more captioning and subtitling tools than I have mentioned here, but the three that I have mentioned are perhaps the most useful. As I stated earlier, I have found this approach a useful way of improving my Danish language skills, albeit after I have put the groundwork into learning about grammar, pronunciation and so on, and I am now at a stage where I'd like to be doing the same thing with German films.

And now, if you'll excuse me, I'm off to drool over Ms Mieritz...er...further improve my Danish.


Testing my first post

So, here we go again. Yet another blog for me. I promised to keep my last one going for more than five minutes and failed. But, since Posterous seems to be the talk of Friendfeed and Twitter, I'm hoping to find out what all the fuss is about.
