The Reality War

A new time-travel action-adventure series begins with The Reality War Book 1: The Slough of Despond, out now.

Kindle  amazon.com (Free*) | amazon.co.uk (Free*) . ePub Kobo (free) | iTunes (free) | Smashwords (free) | Nook (free) | Sony (free)

Paperback  US  | UK   *book1 free on amazon.com 31 Jan 2013. Check price before purchase.

Click here for further details.

Book 2 now available for Kindle amazon.com  | amazon.co.uk and out now  in paperback and ePUB eBook format. Click here for further details.

Amazon v Hachette: Don’t Believe The Spin

timctaylor:

Very interesting tale of what really goes on in the book industry, and of how much internet chatter may actually be influenced by hidden PR campaigns.

Originally posted on David Gaughran:

The internet is seething over Amazon’s reported hardball tactics in negotiations with Hachette.

Newspapers and blogs are filled with heated opinion pieces, decrying Amazon’s domination of the book business.

Actual facts are thinner on the ground, however, and if history is any guide, we haven’t heard the full story. Here’s how it started.

In a historical quirk of the trade, publishers and booksellers negotiate co-op deals at the same time as the general agreement to carry titles. (For those who don’t know, co-op is the industry term for preferred in-store placement, such as face-out instead of spine-out, position on end-caps, front tables, window displays, and so on.)

At publishers’ insistence, the same practice has continued in the online and e-book world, namely that negotiations regarding virtual co-op (e.g. high visibility spots on retailer sites) take place at the same time as discussions over general terms and publisher-retailer discounts.

There is a lot…

View original 1,809 more words

Now you can learn the secret of green energy…

One of my mini-collections of YA fantasy and sci-fi stories is free today and tomorrow on Kindle. I wrote them really for my son rather than to make a zillion sales, but you can enjoy them yourself too by downloading for free from http://getbook.at/TUGE

You don’t need a Kindle device; you can read Kindle-format books on PC, Mac, Android and iOS [See here for details]

Getting high up the free download ‘bestseller’ charts is fun, and although it doesn’t make me any money directly, the publicity could help.

So pop over to http://getbook.at/TUGE, grab the goodies, and put a smile on my alter-ego Crustias’s face :-)

Writing tips: how to write robust paragraphs in Word

[In this post, I'm going to address a topic that some of you might consider pretty basic: how to write a paragraph using Microsoft Word. It certainly is fundamental, but I would say around 15% of the manuscripts I receive in my capacity as publisher, editor, or formatter have been written by authors who do not know how to tell Word when a paragraph has finished. With a lot of additional rework and luck, you can just about get away without knowing how to do this if you self-publish a paperback, but if you self-publish an eBook starting from a Word document without properly defined paragraphs, the results will be unreadable – Tim]

The most fundamental task of a word processor is to wrap lines for you automatically.

If you were looking at the paperback book from which this post was extracted, you would see it consists for the most part of paragraphs, each of which is separated from the next by a small gap. Most paragraphs have more than one line of words. In fact, you are reading such a paragraph now.

When you type such a paragraph into Microsoft Word, the correct approach is to keep on typing until you get to the end of the paragraph. Then you tell Word that you’ve finished the paragraph by pressing the Enter key. Then you start typing the next paragraph.

For simplicity I’m using the term ‘Enter key’. You might refer to it as ‘Return’ or ‘Carriage return/ line feed’. The button on my keyboard has a short downward line followed by a longer line to the left that terminates in an arrow. These are all names for the same button. It is very rare for PC/Windows software to distinguish between them, but some Mac software does, including Word.

You must not tap the Enter key until you have finished the paragraph. Inside a paragraph, it is the word processor’s job to decide automatically when a line is full and the next word needs to start on a new line. If you try to do this yourself by hitting the Enter key in the middle of a paragraph, then you are going to have a badly formatted paperback book. And if you use the same Word manuscript as the basis of your eBook edition, that is likely to be even worse. In the latter case, I’m not talking ‘doesn’t quite look professionally formatted’, I mean ‘utterly unreadable, ask for money back and complain about poor quality to Amazon.’

The reason is simple. You might think you are setting a new line at the correct place. But it is dependent upon variables such as font size, margins and page size. As soon as any of these variables change, your new line will be in the wrong place. And with eBooks, all of these variables are completely out of your control.

I’m going to repeat the previous paragraph but I’m going to insert a paragraph break at the end of each line. When I wrote this in Microsoft Word on my computer, the lines appeared to wrap perfectly. I can’t tell precisely how this will look on whatever device you are using to read this post, but I am sure it won’t look good. If you are using a Word document to send to Amazon KDP or Smashwords, or using a Word document as the input to an automated conversion tool, such as Calibre, then this is the kind of result you should expect if you don’t set paragraphs as I’ve explained.

The reason is simple. You might think you are setting a new line at the

correct place. But it is dependent upon font size, margins and page size.

As soon as any of these variables change, your new line will be in the

wrong place. And with eBooks, all of these variables are completely out

of your control.
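As an illustration only (this sketch is not from the book), the same effect can be reproduced in a few lines of Python. The standard `textwrap` module stands in for the word processor's automatic line wrapping, and the widths of 72 and 40 characters are arbitrary assumptions standing in for "my page" and "your screen".

```python
import textwrap

PARAGRAPH = (
    "The reason is simple. You might think you are setting a new line at the "
    "correct place. But it is dependent upon font size, margins and page size."
)


def reflow(text, width):
    """Wrap one properly defined paragraph to the given width."""
    return textwrap.wrap(text, width=width)


def hard_wrap_then_reflow(text, author_width, reader_width):
    """Simulate pressing Enter at the end of every line at author_width,
    then showing the result on a display that is reader_width wide."""
    hard_lines = textwrap.wrap(text, width=author_width)
    shown = []
    for line in hard_lines:
        # Each hand-broken line is now a paragraph of its own,
        # so the reader's device wraps it separately.
        shown.extend(textwrap.wrap(line, width=reader_width))
    return shown


good = reflow(PARAGRAPH, 40)
bad = hard_wrap_then_reflow(PARAGRAPH, 72, 40)

print("\n".join(good))
print("-----")
print("\n".join(bad))
```

The first printout fills every line as far as it can. The second contains short, ragged lines wherever a hand-made break no longer coincides with the edge of the narrower display.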

The screenshot below shows how your document should look.

How your paragraphs SHOULD look

Notice in the Ribbon that I’ve ringed the show/hide button (¶). Setting show/hide on means that I see a paragraph mark at the end of each paragraph and a section break at the end of the page.

The ‘show/hide’ or ‘paragraph mark’ (¶) is properly called a ‘pilcrow’. This blog post has been extracted from the manuscript I wrote for a book. When I needed to enter a pilcrow into the text for the paperback, I added it through ‘Insert Character’ from the Ribbon, picked the ‘Arial Unicode MS’ character set, and went hunting for the pilcrow. For the eBook version, readers won’t have the Arial Unicode MS font installed on their devices (unless reading using Kindle Reader for PC), but through the magic of Unicode, if your eBook reader has any font that includes the pilcrow, then it should be able to swap to that typeface and display the character. Modern eBook devices and tablets have good enough Unicode support to display pilcrows and many thousands of characters beyond, although my Kobo Mini reader isn’t able to use fallback fonts in this way.

Unless you’re writing a book on formatting, you probably won’t need to enter a pilcrow yourself, but I’m using it as an example of how you can get special characters into your book.

Now we’ll see the same text but with the paragraphs broken up. Here I’ve hit the Enter key at the end of each line instead of at the end of each paragraph. Remember, if we turned off the show/hide option (and so hid the pilcrows), both examples would initially look identical. But if we changed page size, margins, font, font size or even our version of Word, the lines would break in the wrong place in the second example but would adjust automatically in the first.

How your paragraphs SHOULD NOT look!

Pilcrows don’t always look the same. The most obvious difference is that sometimes the head is filled in and sometimes not. In the screenshot above, the text uses a font called Palatino Linotype, and for that font the pilcrows are hollow. At the bottom I’ve added three blank lines in another font: Calibri, for which the pilcrow is filled in.

I’ve written so far about using the Enter key before the end of the paragraph. Sometimes people keep pressing the space bar or the tab key for the same effect. This has the same results and causes the same formatting disasters as soon as any of the variables changes (such as font size or margins).
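If you can export your manuscript as plain text with one paragraph per line, a rough check for both problems can be scripted. The following is a minimal sketch under that assumption, not a tool from the book: the function name and the two rules of thumb (a run of three or more spaces or a tab inside the text, and a paragraph that does not end in sentence punctuation) are my own illustrative choices, and they will raise false alarms on headings and dialogue fragments.

```python
import re

# A paragraph normally ends with sentence punctuation, optionally followed
# by a closing quote or bracket.
END_OF_SENTENCE = re.compile(r'[.!?:…]["”’)\]]*$')

# Three or more spaces, or any tab, between two visible characters.
PADDING = re.compile(r"\S(?: {3,}|\t+)\S")


def suspicious_paragraphs(paragraphs):
    """Return (index, reason) pairs for paragraphs that look hand-wrapped."""
    findings = []
    for index, paragraph in enumerate(paragraphs):
        text = paragraph.strip()
        if not text:
            continue
        if PADDING.search(text):
            findings.append((index, "spaces or tabs used as padding"))
        if not END_OF_SENTENCE.search(text):
            findings.append((index, "ends mid-sentence, possible early Enter"))
    return findings


sample = [
    "The reason is simple. You might think you are setting a new line at the",
    "correct place. But it is dependent upon font size, margins and page size.",
    "Chapter title      spaced out by hand.",
]

for index, reason in suspicious_paragraphs(sample):
    print(index, reason)
```

Anything the check reports still needs a human eye, with show/hide switched on in Word, before you change it.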

Tim

Follow this link to my other writing and publishing tips

This article was adapted from ‘Format Your Print Book for Createspace: 2nd Edition‘ available now as a Kindle eBook, and as a 296 page paperback:

eBook: amazon.com |  amazon.co.uk

Paperback  amazon.com |  amazon.co.uk

Tips for self-publishers: Typography 103 — Kerning and Spacing

If you look back at the advanced font tab in Microsoft Word, there’s an entry there for Kerning. This isn’t specific to OpenType fonts and is something you should keep an eye on with your titles (for example, chapter headings, book and part title pages). Kerning refers to moving characters closer together to avoid unnecessary gaps. It’s generally a good thing to have kerning set on for headings and titles. For example if you have a capital W followed by a lower case ‘a’, where should the ‘a’ start? With a kerned font the ‘a’ will shelter somewhat under the ‘W’, which looks neater and more professional. With Word up to and including Word 2013, there is no control to fine tune kerning: it is either on or off.

[Image: kerning off]

[Image: kerning on]

Back in the advanced font tab, you will see a spacing option. This simply widens the gap between characters if expanded or reduces it if condensed. It is best used for special effects and titles. To give an idea of how chapter headings might look, I’ve set the examples below to normal spacing, expanded by 3 points, and expanded by 9 points.

[Image: normal, 3pt expanded, and 9pt expanded spacing]

You can also have different settings for each character within the same paragraph. For example:

[Image: different spacing settings within one paragraph]

Here I’ve added a manual line feed between the two lines (Shift + Enter). The first line has normal spacing and the second 6pt expanded, except for the final letter (‘Y’) which has normal spacing (because otherwise the subtitle would be offset from the right-hand margin).

Of course, you could achieve an expanded effect by adding space characters, but if you’re doing this frequently (e.g. with chapter headings), then setting expanded spacing through a heading style, rather than through direct formatting, is easier to apply, easier to change, and more consistent.

Tips for self-publishers: Typography 102 – OpenType

Special Effects with OpenType

If you look up the fonts on your computer you will see nearly all are defined as TrueType or OpenType (in Windows you do this from the Fonts section of the Control Panel app). For our purposes the difference between the two formats is that OpenType gives font designers standard ways to define fancy variations such as ligatures, stylistic sets, and number styles. Because they are defined in a standard way, Word 2010 and later can let you use them directly from the Ribbon.

Other than that, the differences between TrueType and OpenType are unimportant for us with one possible exception: older versions of Windows may crash when using certain OpenType fonts in Word (those that use something called ‘PostScript outlines’). Certainly I found OpenType fonts unstable when I was briefly running Word 2010 on Windows XP. I am typing these words in Word 2013 on Windows 8 where OpenType and TrueType both work equally well.

To access these OpenType features you need Word 2010 or later. From the font dialog you’re familiar with, move to the Advanced tab. Word 2013 makes these easier to get to. From the Ribbon click the text effects button from the Font section of the Home menu. Underneath the text effects such as shadow and outline, you get access to the OpenType features.

Accessing OpenType features

Ligatures

Ligatures are combinations of characters that are bound together into a single glyph, such as combining ‘f’ and ‘l’ into the single glyph ‘ﬂ’. They can add a little flash if you want old-fashioned ornate titles, but are rarely used. If you want them on, pick a Ligature option other than ‘none’ and Word will automatically substitute a ligature for predefined character combinations.


The ligature that got away.

There is one ligature that is so popular, we no longer know that’s what it is. The Latin word ‘et’ means ‘and’ in English. The et ligature was in common everyday use in Roman times and never went away. Today we call it the ampersand.

Number forms

Number forms are more important because they can cause problems if you don’t realize what they are.

Take these example chapter titles. I’m using Constantia font with default settings.

[Image: chapter title with the default (old-style) number form]

The weight of the ‘7’ and ‘6’ is centered consistently and by design, but some people will look at that and think the ‘7’ is in the wrong place. I’ve had plenty of beta readers complain that the ‘7’ is somehow in a subscript setting and that this is an error. It’s not. It’s how the ‘old-style’ number form is defined for this and several other fonts. In the case of Constantia, old-style is the default number form.

If I change the number form from ‘default’ to ‘lining’, I get this:

[Image: chapter title with the lining number form]

Now the numbers are vertically aligned at their tops but arguably look less elegant.

[Image: old-style numbers]

[Image: lining numbers]

The lining number form is easier to read if you’re using tabulated data. Elsewhere it’s a matter of taste. I’d use old-style for my steampunk, historical adventure, or romance novel and lining for my science fiction, non-fiction, or modern-day spy thriller.

Stylistic sets

Font designers can design variations of their fonts that have alternate versions of some or all glyphs. There aren’t many fonts that use this feature. Gabriola from Microsoft is one that does.

Recent versions of the Impact font provided by Microsoft as part of Windows also have stylistic sets, as you can see in this postcard I produced to advertise my Greyhart Press business locally.

Advanced OpenType features in practice

The special ways in which the letters ‘E’ and ‘R’, and ‘T’ and ‘H’, combine are also contextual alternatives. Or at least I think they are. In fact, in Word, the contextual alternative box makes no difference for the Impact font; you get the fancy alternatives by picking stylistic sets instead. And yet how each word is styled depends on the letters contained within it.

I think it’s best to consider the distinction loose, dependent on the font designer’s interpretation, and just have fun and play. While playing, though, remember that a little ‘flash’ goes a long way in typography. In my postcard example, I wanted to get across the message that a local publisher from a small town in England had topped the bestseller charts across the Atlantic in America. That’s why I only used the fancy stylistic sets in the three places that most got that message across, while keeping the other text plain.

Contextual Alternatives

If you tick this box, then the font design can override certain combinations of characters. It is very font-specific and not very common, but an example would be a cursive script (one that looks like handwriting) that replaces certain common words (‘of’, ‘and’, ‘the’) with contextual alternative glyphs designed specifically for use with those words. Put another way, instead of having the regular glyphs for the letters ‘a’, ‘n’, and ‘d’, there might be three special glyphs that, put together, make a neater version of ‘and’.

Next time we’ll see kerning and spacing examples.

Tips for self-publishers: Typography 101

We’re going to be talking about fonts in the next few posts. Before we do, it makes sense to introduce a few basic terms and concepts in typography: the science and art of lettering.

Let’s look at a letter

[Image: the letter ‘A’]

Don’t worry, I’m not going all Sesame Street on you. ‘A’ is the only letter we need. We’re interested in typography here, so let me describe that specific letter ‘A’ as it appears in the paperback version of this book.

That character is an upper case letter ‘A’ which is part of a font called Times New Roman. Instead of a ‘character’ I might also refer to it as a glyph, which means one entry in the list of characters and other shapes defined in the font. We’ll see more about glyphs in a moment.

Now let’s look at another glyph.

[Image: the letter ‘A’ in italic]

It’s still an ‘A’ and if I’m running Microsoft Word under Windows, then my Font menu in the Ribbon still says it is Times New Roman. However, the definition of the glyph actually comes from a different font file. This time it comes from the Times New Roman Italic font file, a font where the glyph for every letter slants to the right.

The fact that Windows does all this in the background for you is usually a good thing. All you need to know is that the ‘I’ button makes text italic and ‘B’ makes it bold. Not only that, but with most fonts (certainly almost all those that come with Windows and Microsoft Office) the result will be a true italic or bold glyph, not one that has been created on the fly and can look ugly (what’s called a faux glyph). If you’re very unlucky, though, this can go wrong, as I explain in Typography 909, a chapter later in my book.

Mac OS works differently from Windows: all the font variants in the font family are presented in font dialogs. You get to see Times New Roman Italic listed as a separate font from Times New Roman, and you get to see Times New Roman Bold, and Bold Italic too.

Under the hood, Macs work the same way as Windows in that if you select some text in Times New Roman font and press the ‘I’ button to make it italic – then this will automatically change the font for that text to Times New Roman Italic. Macs allow the additional option of selecting text and then changing the font of that text to be an italic or bold font, or whatever.

Some handy definitions

People often mix up terms such as ‘typeface’ and ‘fonts’. It can get confusing so I’ve set out below the definitions that I use.

Font — a set of characters (called glyphs). For example, Times New Roman Bold, Times New Roman, and Times New Roman Italic are all separate fonts.

Typeface — a set of related fonts. For example, Times New Roman is a typeface that consists of a normally weighted font, a bold font, an italic font, and a bold-italic font.

‘Typeface family’ — Some typefaces have ‘sibling’ fonts. For example, the DejaVu family has monospace, sans-serif, and serif typefaces. In practice, the difference between typeface and typeface family isn’t always so clear cut. When picking fonts to use, selecting fonts from the same typeface family is one good approach because the fonts will not ‘fight’ each other. Typeface family is not a widely or consistently used term, but the concept is worth knowing.

Faux glyphs — where Microsoft Word (or another app) knows you want a character to be bold, small cap or whatever, but builds the glyph for you by altering the regular glyph rather than picking a font that is designed to be bold, or small cap etc. For almost all general-purpose fonts you will come across, the only faux glyphs you will see in Word will be for small caps. Faux glyphs are much more common with fancy and display fonts.

Next time, we’ll look at OpenType special effects such as ligatures and stylistic sets.

Tips for self-publishers: How to publish your back catalog

I’ve worked with a number of authors who have a back catalog of traditionally printed books for which the rights have now reverted to them. This throws up a number of problems when they wish to republish their books, whether as eBooks, paperbacks or both.

If you have your original Word or other word processing document, then your task is much easier, but take care with copy editing. Most commonly the document file you have is what was sent to the publisher before final copy editing. In other words, you need to go through copy editing again. Even if the bulk of copy editing was contained within your Word document, it’s common for a few last-minute changes to have been made at the publisher’s end.

If your publisher gave you a PDF of the finished book, then this carries its own problems. You can convert a PDF to Word format, but the result is not pretty. Unless you are publishing a paperback of the same trim size (page dimensions) as the publisher’s PDF, you will need a lot of work to knock it into an acceptable state before publishing. Even so, it gives a better result than the last resort: scanning.

Scanning is the most common way to re-publish an old book in my professional experience. Take a paperback, scan the pages using an OCR scanner (Optical Character Recognition), assemble the scans into a single Word document, tidy, format, republish.

This sounds simple. After all, most inkjet printers can do OCR scanning these days. But scanning isn’t as easy as it looks, and even the best scanning will leave many difficult-to-spot errors.

The first scanning task is to turn the printed version of your book into a single Word document (or other word processor). The best way is to pay a professional to do this. Google for ‘book scanning’ services in your country and get some quotes. You’re looking here for a service provided by a printing company. If you have the time and patience you can do this yourself with a cheap multi-function printer, but expect the professionals to do a faster and better job of it.

Even a professional job will still be loaded with errors. For example, suppose you have a character called ‘Saul’. The OCR software will have a very hard time telling the difference between ‘Saul’ and ‘Soul’. Most likely you will get a random mixture of both. Your spell checker will not complain about either so that won’t help. Most problems can be identified by reading out aloud or converting to an eBook format and getting your iPad or Kindle or whatever to read to you. But in the case of ‘Saul’ / ‘Soul’ even that won’t help.

[Image: eBook cover]

Jeff Noon is one of the authors I worked with on their back catalog. Jeff proof read very thoroughly before I saw the manuscripts, which helped enormously. Click on the image to see Jeff’s new eBook editions.

And a more amateurish job will be laden with spelling errors for you to address. The OCR software will struggle to differentiate between the number ‘1’ and the lower-case character ‘l’. ‘6’ and ‘b’ may look the same depending on the font. It may decide an opening double quote is actually a superscript ‘m’ and a closing double quote is a superscript ‘3’.
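Some of these confusions can be hunted down mechanically. The following Python sketch is only an illustration under my own assumptions, not a substitute for proofreading: it lists words that mix letters and digits (such as a ‘1’ that should be an ‘l’) and counts rival spellings of a name. The function names and the sample sentence are made up for the example, and genuine tokens such as ‘1st’ will also be flagged.

```python
import re
from collections import Counter

# A word that contains at least one letter and at least one digit.
MIXED = re.compile(r"\b(?=\w*[A-Za-z])(?=\w*\d)\w+\b")


def mixed_tokens(text):
    """Return words containing both letters and digits, e.g. 'Sau1'."""
    return MIXED.findall(text)


def name_variants(text, variants):
    """Count how often each spelling in variants appears as a whole word."""
    words = Counter(re.findall(r"[A-Za-z]+", text))
    return {variant: words[variant] for variant in variants}


sample = "Sau1 opened the 6ook. Soul said nothing, but Saul smiled at 3 o'clock."

print(mixed_tokens(sample))
print(name_variants(sample, ["Saul", "Soul"]))
```

A count that shows both ‘Saul’ and ‘Soul’ in a book with no souls in it tells you to search for every ‘Soul’ and check each one by eye.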

The best approach to publishing a scanned in book is to accept right from the beginning that tidying a scanned book is a lengthy task. Use a variety of proofing techniques at the start, and then expect to use beta or proof readers to pick up the few examples you missed.

Here are some techniques to use:

  • Distrust every use of a numeral. Check for every use of the number ‘1’. Then search for every use of ‘2’ etc.
  • Investigate every single spelling error reported by your word processor (usually shown with a red underline in Word). If you are certain the word is correct but isn’t in Word’s dictionary, click the option to ‘add to dictionary’.
  • Investigate every grammar error (usually shown with a green underline in Word). Yes, I know this is tedious. Word will report scores of grammar errors where you know better. Sometimes when there is an error in your manuscript, Word can’t identify the error directly, but knows something isn’t right and so flags a grammar error. In other words, when the grammar checker finds a genuine error the grammar rule it tells you has been broken is usually nonsense, but if you look deeper into the sentence, there is a real error lurking underneath. Remember the example above of ‘Soul’/ ‘Saul’? There is a good chance that the grammar checker will spot this.
  • Investigate every suggested word. Word 2007 started putting blue squiggly lines under words it thinks you might have mistaken and suggests what you should have used instead. For example it’s and its. With each new edition of Word this seems to get more accurate. You should be looking at these in any case, but if you’re scanning in a book, go through all the blue squiggles now.
  • Get a computer to read the results back. There are various ways to do this. The easiest is on an eReader such as Kindle or iPad/iPhone. Check your manual to see whether your device manages text-to-speech. To transfer your book to your eReader, do this:
    • On your computer, download a free eBook management tool called Calibre.
    • Save your manuscript from Word as html format.
    • From Calibre, Add Book. Browse to the html file you saved and add that.
    • Convert the book to the required format. MOBI for Kindles and ePUB for everything else.
    • Connect your device to your computer using your USB cable.
    • Once Calibre has detected your device, right click the book on your Calibre library screen and ‘send to main memory’ on your device.
    • From Calibre, eject your device.
    • Disconnect your eReader and set your device’s text-to-speech option running.
    • Doing this with Apple tablets and phones doesn’t always work. Apple wants you to do everything through iTunes. I use Dropbox to send myself files to my iPad, but you could email yourself.
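If you convert manuscripts often, the conversion step can also be scripted, because Calibre ships with a command-line tool called `ebook-convert` that takes an input file and an output file. The sketch below is a hedged example rather than a recommended workflow: it assumes Calibre's command-line tools are installed and on your PATH, the file name `manuscript.html` is hypothetical, and it leaves every conversion option at Calibre's defaults.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(html_path, fmt="epub"):
    """Return the ebook-convert command that turns html_path into fmt.

    The output file sits next to the input, with the new extension.
    """
    source = Path(html_path)
    target = source.with_suffix("." + fmt.lower())
    return ["ebook-convert", str(source), str(target)]


def convert(html_path, fmt="epub"):
    """Run the conversion, but only if ebook-convert can be found."""
    if shutil.which("ebook-convert") is None:
        print("ebook-convert not found; is Calibre installed?")
        return None
    command = build_convert_command(html_path, fmt)
    subprocess.run(command, check=True)
    return Path(command[2])


# Show the command that would be run for a Kindle (MOBI) conversion.
print(build_convert_command("manuscript.html", "mobi"))
```

Sending the converted file to your device and ejecting it are still done from the Calibre window as described in the steps above.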

The Takeaway from this Post

  • If you have the rights, then taking back control of your back catalog and self-publishing can be tremendously fulfilling.
  • However, don’t underestimate the work required to get your old books up to scratch, especially if you have to scan them in.

Grey DeLisle: Interview with a Voice Actress Extraordinaire!

timctaylor:

Wow! Can’t believe I missed Armand getting to interview the marvelous Grey DeLisle, the voice of Daphne Blake in the 21st century.

Originally posted on Inezian's Notes.:

Grey with the extraordinary cast of characters that she’s voiced.

For my third Granite State Comicon interview, I am starstruck to be hosting the interview with voice actress Grey DeLisle! Grey began her entertainment career as a stand up comic and singer (releasing a number of albums), but has more recently turned her talents to voice acting, with a stunning list of credits. If you or your children watch cartoons, chances are good that you’ve seen (or heard) some of her work in shows like Fairly Odd Parents, Scooby Doo, Handy Manny, and the Penguins of Madagascar. She has also done extensive voice work for major video game releases.

Q: For many of us, the careers of voice actors are a bit of mystery. It’s sort of like trapeze artists or cruise ship captains. Obviously, someone does those jobs, but we often wonder: How did they get there in the…

View original 531 more words

Snot Wizards and alien editors. I’ve been interviewed!

Just to prove I do write fiction books too sometimes, I’ll point your way to Inezian’s Notes where I’ve been interviewed about my YA books.  Scooby Doo gets a mention, somewhat unexpectedly, and I think 2000AD comes up at some point too (it usually does). Oh yes, and there’s the snot wizard.

Here’s some Wattpad advertising I’ve dug up to give you a feel for the books. Yes, I know they don’t look exactly slick, but I was just having a little fun.

[Images: four Wattpad adverts for the books]

Kindle support for Unicode pt2: how to use Unicode

Last time I posted about how Kindles can support Unicode, despite rumors to the contrary. This time I’m going to give some practical advice on what Unicode is, and how if you are self-publishing eBooks you can use it in your books.

It’s a fairly long post, so here’s a:

Cut-out-and-keep executive summary

  • Kindle support for Unicode is very good, although weaker with the earlier models (I mean specifically Kindle 1, Kindle 2 and Kindle DX; Kindle 3 support is excellent).
  • You need to state somewhere in the file you upload that it is encoded in UTF-8. I explain how to do this for saving as an HTML file from Word and then uploading to KDP, and also for people working directly with HTML or XHTML.
  • Support for Unicode with ePUB readers is so much weaker than with Kindle devices that if you use Unicode glyphs, you need to assume that they won’t show up for some people.
  • The world of eBooks is moving fast. I wrote this post in December 2013. I expect UTF-8 to remain a dominant encoding system for many years to come, but the idiosyncrasies of Amazon KDP and ePUB support are likely to change more rapidly, so this post will date.

What is Unicode and why should you care?

Since the earliest days of computers and telecommunication there’s been a need for a standard way to encode text. Documents are stored digitally as bits and bytes: numbers essentially. So what number or numbers represent an upper case ‘A’, and what number or numbers represent a dollar symbol? If Computer A wants to send a document to Computer B, then both computers need to agree on the same encoding system, otherwise what looks good on Computer A will look gibberish on Computer B.

What we require is an independent standards body to define the encoding system and the codes within it. One of the important early standards (from the 1960s) was ASCII 127. The coding system represented each character as a seven-bit binary number, and the list of which of the resulting 128 possible ‘code points’ (numbered 0 to 127) corresponded to which character was defined like this: http://www.asciitable.com/. With only 128 code points, some of them used for control characters (such as the one that rings the bell; ASCII 127 was used with teletypes), there wasn’t room for some common characters. We get upper case and lower case ‘A’ through ‘Z’ but we don’t get any accented characters. We get numbers and basic mathematical operators. We get a dollar symbol, but we don’t get a pound (£) sign. We get a basic ‘typewriter’ apostrophe and quotes, but we don’t get the proper curled versions (what Microsoft calls smart quotes) that have always been the norm in books and magazines.
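To make the seven-bit limit concrete, here is a small Python sketch (my own illustration, with an arbitrary choice of sample characters). Seven bits give 2**7 = 128 values, numbered 0 to 127, so any character whose code point is 128 or above cannot be an ASCII character.

```python
def is_ascii(character):
    """True if the character fits in seven bits (code points 0 to 127)."""
    return ord(character) < 128


# A few characters mentioned above: letters, the dollar sign, a typewriter
# apostrophe, then a pound sign, an accented letter and a curled quote.
for character in ["A", "a", "$", "'", "£", "é", "’"]:
    code_point = ord(character)
    status = "in ASCII" if is_ascii(character) else "not in ASCII"
    print(f"{character!r}: code point {code_point} ({code_point:b} in binary), {status}")
```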

The need to limit character encoding to 7 bits is a constraint that has long since become obsolete, so people have naturally wanted more characters. The same problem persists, though: everyone needs to agree the rules for how those characters should be stored in computer files. ASCII 127 was pretty dominant for many years. Today there are many rival schemes for encoding characters, which is why I’ve no doubt you’ve seen examples where encoding has gone wrong. The most popular standard at the moment is a Unicode encoding called UTF-8. Most websites you see are now using UTF-8. So even if you’ve not heard of it, you have definitely used it.
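
Here is a small Python sketch of what ‘encoding has gone wrong’ looks like in practice. The word ‘café’ and the choice of Windows-1252 (cp1252) as the wrong encoding are just my examples; any mismatch between how text was saved and how it is read produces the same kind of mess.

```python
text = "caf\u00e9"                 # 'café', with an accented e

data = text.encode("utf-8")        # the bytes actually stored in the file
print(data)                        # b'caf\xc3\xa9'

right = data.decode("utf-8")       # reader agrees with the writer
wrong = data.decode("cp1252")      # reader guesses a different encoding

print(right)                       # café
print(wrong)                       # cafÃ©
```

The bytes in the file are identical in both cases. All that changed is which encoding the reading software assumed, which is why declaring the encoding matters so much.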

The reason you should care is that if you have any character not in the ASCII 127 character set, then you need to find a way to encode it safely. If the protagonist in your novel speaks some words in Spanish or Polish or Hebrew or whatever, then you need something better than ASCII 127.

It’s not just the threat of getting things wrong; there’s the opportunity too. If you want a hammer and sickle symbol for your Cold War spy thriller, use ☭ (U+262D). Try arrow symbols for your pirate map (U+27AA), or a heart for your ‘I love New York’ t-shirt, or perhaps your historical romance needs a fancy scene break character (U+2767). Unicode supplies the answer. (By the way, I’ve double-checked, and all those symbols I’ve just mentioned render fine on my Kindles.)

You don’t need to know how UTF-8 works in order to make an ebook (or webpage) in UTF-8. Here’s all you need to know about how it works.

Suppose you have a webpage that needs something more than the old ASCII 127 characters. Here’s what you do.

Your webpage is written in a coding language called html.

At the top of your html document is a statement that says “I am encoded using UTF-8”. That’s not something human readers get to see; that statement is put in a special place for other software to find.

Now suppose someone opens up your webpage in a browser. That might be on an Android phone, iPad, Mac, PC or something else. Doesn’t matter. The browser looks at your html page and looks for the statement that tells it how you encoded all your characters.

The browser recognizes UTF-8 as one of the encoding systems it understands. Now it can separate out all the characters in your webpage and knows which Unicode code point each one represents.

What’s a Unicode code point? Take a look at the screenshot I showed you in my last post on Unicode.

The screen is listing separate Unicode code points. So the code point for the ‘black heart suit’ symbol I used in the Jack Fish book (with all the ‘I ♥ New York’ T-shirts I mentioned last time) is U+2665.

So when I enter the heart symbol into my html code, it is saved in the file as the UTF-8 bytes that stand for U+2665. When your browser reads those bytes and recovers that Unicode code point, it knows it has to look up that code number in the current font file and display the shape it finds for that code. That shape is called a glyph, and will almost certainly be defined as a vector graphic, so it is sharp at any size.
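
If you are curious what actually lands in the file, this short Python sketch shows the heart symbol three ways: as a character, as a Unicode code point, and as the bytes UTF-8 uses to store it. It uses only Python’s standard library; the variable names are mine.

```python
import unicodedata

heart = "\u2665"

code_point = f"U+{ord(heart):04X}"     # how Unicode refers to the character
name = unicodedata.name(heart)         # the official Unicode character name
stored = heart.encode("utf-8")         # what is written to the file

print(heart)        # ♥
print(code_point)   # U+2665
print(name)         # BLACK HEART SUIT
print(stored)       # b'\xe2\x99\xa5'

# Plain ASCII characters still take a single byte in UTF-8.
print(len("A".encode("utf-8")), len(stored))   # 1 3
```

So one code point can take more than one byte on disk, while the old ASCII characters still take a single byte each. You never need to handle those bytes yourself; the software does it once it knows the file is UTF-8.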

Now, as you might imagine, not every font has every Unicode code point defined. For example, I don’t think the Times New Roman font has U+2665 defined. If your web page is set in Times New Roman, what decent browser software will say to itself is this: “Hmm, I don’t have anything for U+2665 in the current font. I could display an empty box, question mark, or some other gobbledygook. But that’s a last resort. What I’ll do first is check whether I have a fall back font that does know how to display U+2665. I’ll look in my Arial Unicode MS font first, because that has thousands of glyphs (your magic talking browser might try another font, such as Lucida Grande, on a Mac). Ah, yes. There we are!”
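
The browser’s little monologue can be sketched in a few lines of Python. To be clear, this is only a toy model: the function pick_font, the font names and the coverage sets are all made up for illustration, and real browsers and fonts are far more involved.

```python
def pick_font(code_point, fonts):
    """Return the name of the first font with a glyph for code_point, else None."""
    for name, coverage in fonts:
        if code_point in coverage:
            return name
    return None  # nothing found: the reader shows a box or question mark


# Made-up coverage sets for illustration only, not real font data.
body_font = ("BodyFont", {ord(c) for c in "abcdefghijklmnopqrstuvwxyz"})
fallback_font = ("FallbackFont", {0x2665, 0x262D, 0x2767})

heart = 0x2665

with_fallback = pick_font(heart, [body_font, fallback_font])
without_fallback = pick_font(heart, [body_font])

print(with_fallback)      # FallbackFont
print(without_fallback)   # None
```

The second call shows what happens on software that only ever looks in the current font: the code point is perfectly valid, but there is no glyph to draw.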

That’s really all you need to do: declare that your file is encoded using UTF-8 and hope that the browser reading your webpage has each glyph defined in the font you have specified, or in a fall back font.

How your glyph displays depends on how the font designers have decided it should look. I’ve worked a little with Hebrew glyphs in Kindle books and find that there is a big variation for the same Unicode code point. In my own example of the I love NY t-shirts, the Unicode code point is named by (I presume) the Unicode Consortium as ‘black heart suit’. But on Kindle for iPad, the heart actually comes up red. The same Kindle book on a Kindle Fire (a color device) will show the heart symbol as black.

I’ve used a web page as an example here. But eBooks in Kindle or ePUB format are essentially web pages. Each section of the book is an html page (or a variant called xhtml). There is software embedded in your Sony Reader, Nook, Kindle or whatever that tells the device how to display each page, just the same as Chrome, Safari or Internet Explorer tells a computer or other device how to display a web page.

How to insert Unicode characters using Microsoft Word

Open up the character map (Insert | Symbol) and pick your Unicode symbol, then press the insert button at the bottom to put it into your text. One thing to watch, though: check that the little box at bottom-right says ‘Unicode’ or can be set to Unicode. If you click on the little down arrow and Unicode isn’t an option then the symbol isn’t a glyph for a Unicode code point. You can still use the symbol in a paperback if you embed the font in the PDF, but the symbol will come out as gobbledygook in an eBook (unless that same font is embedded in the eBook, something I don’t advise).

To see the most Unicode code point glyphs in the character map, you will want a font specially designed to have glyphs for Unicode. In the screenshot below, I’ve selected Arial Unicode MS, which is provided by Microsoft and available for Windows, and Macs if you have the right Office installations. This has a huge number of Unicode code point glyphs. For Macs without Arial Unicode MS, try Lucida Grande. There are some free Unicode fonts around, though the purpose behind most is to provide fonts for many languages, rather than fancy characters such as heart symbols. Try looking here: http://en.wikipedia.org/wiki/Open-source_Unicode_typefaces

How to make a Kindle book with UTF-8 if you upload a Word or html file to Amazon KDP or some other auto-converter

For this approach, you need to be able to upload an html file to whatever service makes your Kindle book for you. The simplest way to do this is to save your Word document as html (from the Save As… menu in Word) and then upload the resulting html file directly to Amazon KDP (though you’ll need to read my note in a moment if you include images).

When you save to html, you must set the encoding to UTF-8 as in the following screenshot.

Here I am Saving As… and changing the format (Save as type) to Web Page, Filtered (which is a slightly more streamlined version of what you would get if selecting save to html). I click on the ‘Tools’ button right at the bottom and pick ‘Web Options’. Then I pick ‘Encoding’ and Save the document as Unicode (UTF-8). What this does is to put a statement at the top of the html file that says ‘I am encoded using UTF-8’.

I’ve just tested this out myself to double-check it works. I’m writing this post in Word 2013. I’ve saved as html, zipped the result (see next section for why) and uploaded that to Amazon KDP. The result looks great with all my heart symbols and other Unicode fanciness coming out perfectly in the resulting Kindle file.

In fact, here’s a screenshot of my previous blog post saved to html, uploaded to Amazon KDP, downloaded and then sent to my iPad as a Kindle book. I started all this lengthy post about Unicode because I’d read someone post online that Kindle books don’t support Unicode. There’s my Unicode heart symbol to prove that isn’t so.

Html and images

This is going a little off-topic, but I can’t talk about uploading html files without a little explanation about images. If I went through saving the Word document for this post as html and uploading to Amazon KDP, then all the screenshot images will be missing. It’s easy to fix (so long as you aren’t too bothered about image quality).

Suppose you have a Word document called (naturally) MyDoc.docx. If MyDoc contains images, then when you save as html you will find a file has been created called MyDoc.html. So far, so simple, but Word will also create a subfolder called ‘MyDoc’ and in there it will place compressed versions of your images, saved as separate files and numbered (e.g. image0001.jpg). For Amazon KDP, what you need to do is create a zip file of your html file and the folder of images. Upload that zip file to Amazon KDP and it will look fine. [Here's how to zip on Windows (and don't worry if you don't have Windows 7 as it's worked this way for a long time) and on Mac.]
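
If you would rather script the zipping than do it by hand, here is a minimal Python sketch using the standard zipfile module. The function name zip_for_upload is my own invention, and the demo builds throwaway stand-in files named after the example above so that it runs anywhere; point it at your real html file and image folder instead.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_for_upload(html_file, image_folder, zip_path):
    """Zip one html file plus its image folder, keeping the folder name."""
    html_file = Path(html_file)
    image_folder = Path(image_folder)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(html_file, html_file.name)
        for path in sorted(image_folder.rglob("*")):
            if path.is_file():
                inside = Path(image_folder.name) / path.relative_to(image_folder)
                zf.write(path, inside.as_posix())
    return zip_path


# Demo with throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "MyDoc.html").write_text("<html></html>", encoding="utf-8")
    (tmp / "MyDoc").mkdir()
    (tmp / "MyDoc" / "image0001.jpg").write_bytes(b"not a real image")
    archive = zip_for_upload(tmp / "MyDoc.html", tmp / "MyDoc", tmp / "MyDoc.zip")
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())

print(names)   # ['MyDoc.html', 'MyDoc/image0001.jpg']
```

The important detail is that the images stay in their subfolder inside the zip, so the image links in the html file still point at the right place.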

The only problem with this approach is that Word always tries to compress your images when saving to html format. You have some limited control through the ‘Pictures’ tab (to the left of the ‘Encoding’ tab in the Web Options screenshot above) but the normal Word Options setting that allows you to turn off image compression doesn’t apply to saving as html, at least not in my Word 2013. When Amazon KDP builds your Kindle book (or you do it yourself through Kindlegen) then that will compress your images anyway, so it might not make a lot of difference for large images but just be aware that when Word saves to html it quietly changes your images.

How to make a Kindle book with UTF-8 if you code your own html

This is the way I make eBooks.

Html files should have the following in the <head> section

<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

[this is the statement that Word adds when you save to html and set the encoding to utf-8 in Word]

For xhtml files you want

<?xml version="1.0" encoding="utf-8" ?>

How to make an ePUB book support UTF-8

I’ve concentrated on Kindle books so far; ePUB format uses xhtml files to store its book content and these files need the encoding statement I’ve just given (<?xml version="1.0" encoding="utf-8" ?>).
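
Before uploading, you might want a quick sanity check that a file really does carry one of those encoding statements. This is a rough Python sketch, not a proper validator: the function name declares_utf8 and the two patterns are my own, and they only look for the two statements shown in this post near the top of the file.

```python
import re

# Looks for charset=utf-8 inside a <meta ...> tag.
META_CHARSET = re.compile(r'<meta[^>]*charset\s*=\s*["\']?\s*utf-8', re.IGNORECASE)
# Looks for encoding="utf-8" inside an <?xml ... ?> declaration.
XML_ENCODING = re.compile(r'<\?xml[^>]*encoding\s*=\s*["\']utf-8["\']', re.IGNORECASE)


def declares_utf8(markup: str) -> bool:
    """Return True if the start of the markup declares UTF-8 in either style."""
    top = markup[:2048]   # the declaration belongs near the top of the file
    return bool(META_CHARSET.search(top) or XML_ENCODING.search(top))


html_page = '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"/></head></html>'
xhtml_page = '<?xml version="1.0" encoding="utf-8" ?><html></html>'
bare_page = "<html><head><title>No declaration</title></head></html>"

print(declares_utf8(html_page))    # True
print(declares_utf8(xhtml_page))   # True
print(declares_utf8(bare_page))    # False
```

Note that the xml pattern insists on plain straight quotes around utf-8. Word processors and blog software love to curl quotes, and curled quotes inside markup are a mistake worth catching.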

In general, ePUB books are trickier to give guidance for than Kindle because when it comes down to the fine details, there is much more variation in the way in which the ePUB format is implemented by the various ePUB reader devices and the various firmware versions that sit upon them. I’ve read people suggest that because ePUB is an open standard, all you need to do is write one ePUB file and it is guaranteed to work the same way on every device that can show ePUB books. I’m afraid that is far from the truth, but that’s for another post.

When it comes specifically to Unicode support on ePUB, I find support on my Nook Glow, iPad iBooks, and Adobe Digital Editions is good. My Kobo Mini isn’t so good. In my last post I showed some East European Latin extensions implemented as Unicode UTF-8. In that previous post they looked good on my Kobo Mini. But I was cheating! Here’s another screenshot where the Kobo can’t find the right glyph and gives a box with a cross through, what I call a ‘huh?’ symbol.

So what’s gone wrong?

The problem with the Kobo is that it doesn’t seem capable of working with fall back fonts. If the Kobo comes across a Unicode code point, it looks in the current font to see whether it has a glyph defined for it. If it doesn’t, it gives up. What it doesn’t do is go looking in a fall back font. Which is a shame because Kobos have good Unicode support in their Georgia font, which can display those characters perfectly.

You could try forcing the font to Georgia, but that’s easily overridden by the user.

So when I build eBooks for clients and they have requirements outside of a basic Latin character set, I have a conversation about portability – which basically means how confident we can be that the book will behave as we intend across a range of platforms. In some cases this means we have a higher-spec version of the eBook for the Kindle format, and produce a dumbed-down version for ePUB.

What’s the take-away from this post?

Well, there are two, and they’re really the same as in my previous post.

The first is that Kindles do have excellent Unicode support, despite what you might read elsewhere. What’s more, an occasional use of Unicode can lift your book out of the ordinary. If you are coding your Kindle book directly with html, or through a tool such as Sigil, then all you need to do is ensure the encoding is declared correctly at the top of each file, as I’ve shown you. If you upload a doc or html file directly to KDP, then you could try setting the encoding as I’ve suggested in Word’s Save As, or simply make do with basic characters.

The second take-away is to beware of what people post on the internet about how to make eBooks because there is a lot out there that isn’t accurate. Treat whatever you find with suspicion, test your books thoroughly, and try to get multiple opinions. That advice, of course, goes for my posts too, every bit as much as anyone else’s.

Click here for part 1 of my unicode posts

Follow this link to my other writing and publishing tips

