How to Convert PDFs to Word Documents and Image Files – PCMag AU

Posted: February 19, 2022 at 9:50 pm

Here's a common problem. Someone sends you a PDF file, and you need to make large-scale changes in it, the kind of wholesale editing that's impossible to do in PDF-editor apps like the ones you might use to do light edits to a PDF. How do you convert the PDF into a document that you can edit to add or remove paragraphs, move text from one part of the document to another, and so on?

Here's another problem: You need to convert a PDF into an image file to display on the web, insert into a document, or upload to a site that only accepts JPGs. This problem is a lot easier to solve than the first; the answer is at the end of this story.

The only way to make wholesale edits in a PDF file is to transform it into a word-processing document so that you can edit it in Word, Google Docs, or any other word-processing app. What makes it difficult is that the PDF (Portable Document Format) standard, an open standard created by Adobe in the 1990s, is completely incompatible with the DOCX word-processing format that's now standard in Microsoft Word and almost everything else. Don't believe any vendor who claims to make PDF editing as easy as it is in Word. That kind of editing is simply impossible in a PDF file. You have to convert the PDF to a different kind of document first.

There's no perfect solution to this problem, but there are plenty of good-enough solutions. Which solution you should use depends on the kind of PDF that you need to edit. If the PDF was created from a Windows, Mac, or Linux app by exporting from the app to PDF, then the solution is relatively easy because the text of the PDF is embedded in the PDF file and can be extracted. However, if the PDF was created by scanning or photographing printed text, then the problem is a lot harder, because you need to use OCR (optical character recognition) on the scanned image to extract the text, and that process always risks introducing errors.

If you don't know whether a PDF was created by an app or by a camera or scanner, here's how to find out.

Open the PDF in your default PDF app, such as Edge in Windows 11, Preview in macOS, or Adobe Acrobat Reader. Try to select some text by dragging with the mouse. If you can select text, then the PDF was exported from an app, or it has already had OCR applied to it, which is just as good. If you can't select text, then the PDF is scanned and needs to have OCR applied before you can convert it into a Word document.
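If you have a whole folder of PDFs to sort through, the same check can be roughly approximated in a script. The sketch below is not part of any tool mentioned in this story; it is a minimal heuristic that assumes a PDF with selectable text declares at least one font (a /Font entry), while a purely scanned PDF usually doesn't. The function name pdf_probably_has_text is made up for illustration, and the check can be fooled by encrypted or unusual files, so treat the answer as a hint and confirm by selecting text in a viewer.

```python
import re
import zlib

# Matches the raw bytes between "stream" and "endstream" in a PDF file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def pdf_probably_has_text(data: bytes) -> bool:
    """Heuristic: return True if the PDF bytes reference any font.

    Assumption (not a guarantee): PDFs with embedded or OCR'd text
    declare /Font resources, and image-only scans usually do not.
    """
    # Fonts declared in plain, uncompressed objects.
    if b"/Font" in data:
        return True

    # Newer PDFs may pack their objects into Flate-compressed streams,
    # so try to inflate each stream and look inside it as well.
    for match in STREAM_RE.finditer(data):
        try:
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # Not Flate data (for example, a JPEG page image).
        if b"/Font" in inflated:
            return True
    return False
```

To try it on a real file, read the bytes first, for example with pathlib.Path("some-file.pdf").read_bytes(), and pass them to the function. A result of False suggests the file needs OCR before conversion.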

Let's start with PDFs that don't need OCR, in other words, PDFs with embedded text. The simplest way to convert your PDF is to open it in Microsoft Word and let Word convert the content.

Open it as you would any other file in Word. Launch Word and go to File > Open and select your PDF. Or right-click on the file and select Open With > Microsoft Word. The file will likely take a moment to process and then open as an editable Word document.

This solution works reasonably well, but the content probably won't look exactly right. The content in a PDF is "fixed" in one position on the page, and the PDF doesn't let you insert or remove paragraphs while preserving the flow of the document as you can in a word processor.

Here's a list from Microsoft of what may not convert just right:

When you open a PDF in Word, you may see a warning:

"Word will now convert your PDF to an editable Word document," it says. "This may take a while. The resulting Word document will be optimized to allow you to edit the text, so it might not look exactly like the original PDF, especially if the original file contained a lot of graphics."

That said, the graphics will get pulled in, but they may not be exactly where you want them. And your text may end up in text boxes rather than freely flowing through the page. But at least you'll have a document that you can work with.

If you want to export it as a PDF when you're done editing, simply use Word's Export or Save As menu to export your document to PDF format.

By the way, this conversion works not only in the Word desktop app but also in the free web app version of Word (found at office.com). When you open a PDF in Word Online, it's viewable, but click the "Edit in Word" link and you may see a file conversion warning.

It's followed by another warning about changes to the layout, etc. But the content will be there and editable, even if the look went wonky. Give it a try.

Opening a PDF in Word is only one way to convert PDF files to DOCX format. You may get better results by using PDF-editing apps like Adobe Acrobat DC. In my experience, Acrobat does a better job than anything else of exporting PDFs to Word format.

Open the PDF in Acrobat, choose File > Export To from the menu, and export to Word format. Acrobat does a far better job than Word at sorting out page formatting like headers and footers. Word sometimes mixes up the text in the header with the text of the document, but Acrobat almost always gets it right.

The trouble with Acrobat is that it costs money, but Adobe offers a free online PDF converter that you can use to get the same results you get from Acrobat.

You can find cheaper PDF software that converts app-created PDFs to DOCX format, but I haven't found any that does it as well as Acrobat.

Dozens of other free online PDF conversion sites promise to spit out editable text, but I don't recommend any of them as a place to trust uploading your data. Adobe, however, is well-established enough for me to trust it with ordinary documents, though I won't upload anything that I seriously need to keep secret.

One more free app that I sometimes hear recommended for converting a PDF to Word is Google Docs. The instructions are similar to using Acrobat: Open the file for editing, and then download it in Word format. Every time I've tried it, though, the results were terrible. Your luck may be better.

Everything I've written so far focuses on PDFs that were exported from an app so that the text is embedded in the PDF. What can you do about converting PDFs made from a scanner or camera?

Depending on the quality of the scanned image, you may be able to open it in Word, and Word's built-in OCR may be able to create editable text. I've had success with clear single-page images, but Word simply can't handle anything complex, like a scan of a book, and tends to produce an unusable mix of text and images.

In converting scanned images to editable text, Acrobat does a decent job of creating a PDF, but nothing comes close to the power of our Editors' Choice winner for OCR tools, ABBYY FineReader PDF 15 ($199). FineReader's OCR engine is more accurate than anything else I've tried, and it comes with a unique error-checking feature that works like a spellchecker in a word processor, so you can fix OCR errors before exporting the result. FineReader exports the results in Word, PDF, and other formats, and the resulting files are far more usable than anything else I've found.

If you still have the original document that was scanned or turned into an image file, you can use a mobile scanning app with OCR to capture and extract the text.

If you need to convert a PDF into an image file, it's a whole lot easier on a Mac than a PC.

On a Mac, simply open the PDF in Preview. Use the File > Export menu and select the image format you want and the options you prefer, and you have your image file.

On Windows, the best no-cost method is to create a free Adobe account and then go to cloud.acrobat.com/exportpdf. Drag a PDF onto the window. Go to Convert To > Image > Image Format (JPEG, PNG, or TIFF), and use the slider to select the image quality. Multipage PDFs get converted into separate image files. You can then download a ZIP with the image files.

If you're bothered by privacy concerns and don't want to share your data with Adobe, then you can use many image editors to export PDF to image files. My favorite is XnViewMP, which is free for personal and educational use. When you open a PDF in XnViewMP, you'll probably need to follow the prompts to install the open-source Ghostscript app for working with PDF and PostScript files, but you can then use XnViewMP to export a PDF to any standard image format. Keep in mind that all fonts will be converted from scalable TrueType format into bitmap, and small text will look blocky.
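If you're comfortable with a command line, Ghostscript can also do this conversion on its own, without an image editor in front of it. The sketch below is one possible way to build the command, not XnViewMP's own method. It assumes Ghostscript is installed and reachable as gs (on Windows the console program is typically gswin64c), and the helper names ghostscript_command and page_pixels are invented for illustration. A higher dpi value gives each page more pixels, which helps keep small text from looking blocky.

```python
from pathlib import Path


def ghostscript_command(pdf_path, out_dir, dpi=150, device="png16m", executable="gs"):
    """Build a Ghostscript command that renders each PDF page to an image.

    Assumption: `executable` is on the PATH ("gs" on macOS and Linux,
    usually "gswin64c" on Windows).
    """
    # Ghostscript output devices mapped to file extensions.
    extension = {"png16m": "png", "jpeg": "jpg"}[device]
    # %03d is filled in by Ghostscript with the page number: 001, 002, ...
    pattern = Path(out_dir) / f"page-%03d.{extension}"
    return [
        executable,
        "-dSAFER", "-dBATCH", "-dNOPAUSE",  # run unattended, then exit
        f"-sDEVICE={device}",               # output image format
        f"-r{dpi}",                         # resolution in dots per inch
        f"-sOutputFile={pattern}",
        str(pdf_path),
    ]


def page_pixels(width_in, height_in, dpi):
    """Pixel size of a rendered page: inches multiplied by dots per inch."""
    return round(width_in * dpi), round(height_in * dpi)
```

To run the conversion, pass the list to subprocess.run(cmd, check=True) after making sure the output folder exists. A US Letter page (8.5 by 11 inches) rendered at the default 150 dpi comes out at 1275 by 1650 pixels; doubling the dpi doubles both dimensions and makes the files correspondingly larger.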

If you want fine-tuned export options, any commercial PDF editor can export to image files. FineReader, Acrobat, and PDF-XChange Editor all work with excellent results and include options to create small files suitable for display on the web, insertion into documents, or use anywhere else where PDFs aren't supported or convenient.
