Super Practical: How I Efficiently Translated Google’s 65-Page Official Prompt Engineering White Paper PDF

AI Tools posted 1w ago dongdong

A few days ago, while translating Google’s official Prompt Engineering white paper PDF (see “Google Official Prompt Engineering White Paper Complete Translation and PDF Download”), I tried some automated methods to improve efficiency. Here, I’d like to share what I learned about translating PDFs.

First of all, I personally avoid translation methods that try to preserve the original layout. Translated text rarely has the same length as the source, so the result often looks unsightly, with text in varying sizes. In addition, layout constraints can force sentences to be split mid-phrase during translation, which breaks the context and ultimately hurts translation quality.

When I translate a PDF, I first convert it to Markdown, then translate the Markdown file, and finally regenerate a PDF from the translated Markdown. This method preserves text, tables, and images very well. The main drawback is that the original page layout is largely lost. However, the materials I usually translate are mainly text and charts, so this has little impact.

How to convert PDF to Markdown?

I mainly use two ways to convert PDF to Markdown:

1. Directly use a multimodal large language model to generate Markdown.

Among these models, Gemini performs the best: it has strong OCR capabilities and a large context window, and the latest Gemini 2.5 Pro delivers especially good results. If you can access AI Studio (aistudio.google.com), the generous daily free quota makes it almost free of charge. If you are already a Gemini subscriber, it is also very convenient to use Gemini 2.5 Pro in the Gemini app.

Usage is very simple: upload the PDF file and use a prompt like this:
> Help me convert this PDF into Markdown and retain all the content without deleting anything.

The advantage of this method is that it is simple and convenient, and tables are preserved well. The drawback is that the PDF cannot be too large; extraction tends to break down after a few dozen pages. In addition, images inside the PDF cannot be extracted automatically; you need to screenshot them manually or extract them with other tools.
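If you would rather script this step than use the web interface, the same upload-and-prompt flow might look like the sketch below. It assumes a client object that behaves like the one in the `google-genai` Python package (with `files.upload` and `models.generate_content`); the model id `gemini-2.5-pro` and the helper names `pdf_to_markdown` and `save_markdown` are illustrative assumptions, not part of the workflow described above.

```python
from pathlib import Path

# Prompt quoted from the article.
PROMPT = (
    "Help me convert this PDF into Markdown and retain all the content "
    "without deleting anything."
)


def pdf_to_markdown(client, pdf_path, model="gemini-2.5-pro"):
    """Upload a PDF and ask the model to return it as Markdown.

    `client` is assumed to behave like google.genai.Client: it exposes
    files.upload(file=...) and models.generate_content(model=..., contents=...).
    """
    uploaded = client.files.upload(file=str(pdf_path))
    response = client.models.generate_content(
        model=model,
        contents=[uploaded, PROMPT],
    )
    return response.text


def save_markdown(markdown, out_path):
    """Write the returned Markdown to disk as UTF-8."""
    Path(out_path).write_text(markdown, encoding="utf-8")
```

With the real SDK, `client` would be created with something like `genai.Client(api_key=...)` after `from google import genai`. The same size limit applies: for PDFs beyond a few dozen pages the output may be truncated or incomplete.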

2. Use a third-party parsing service or API. I tried several and found three that work relatively well:

MinerU: mineru.net
It’s open-source and free, and can be used online. It extracts both images and text, and in my testing it gave the best results. A single document cannot exceed 200 MB or 600 pages, which is sufficient for most cases.

LlamaIndex’s LlamaParse
The advantage is that it has a user interface where you can directly upload a PDF to generate Markdown, and images can also be downloaded separately.
The downside is that billing is inflexible: it only offers monthly subscriptions, with no pay-as-you-go pricing. Fortunately, the free quota is large enough to parse many pages.

Mistral’s Mistral OCR
The advantage is flexible, usage-based billing. It can also generate Markdown and extract images (although I haven’t managed to get image extraction working).
The disadvantage is that it doesn’t provide a user interface, so you need to write your own code or rely on open-source projects.

The advantage of this method is that it can parse large PDF files that the first method cannot handle. In addition, the images embedded in the PDF can usually be extracted as well (although this fails for some PDFs).
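For the API-only services, the remaining work is mostly turning the response into a Markdown file plus image files. Below is a minimal sketch of that step. It assumes the OCR response has already been loaded as a Python dict with a list of pages, each carrying a `markdown` string and an `images` list of `id` and `image_base64` entries. That shape, and the helper name `save_ocr_result`, are assumptions for illustration; check the field names against your provider’s documentation.

```python
import base64
from pathlib import Path


def save_ocr_result(response, out_dir):
    """Write page Markdown and embedded images from an OCR response dict.

    Assumed shape (not confirmed by the article):
    {"pages": [{"markdown": str,
                "images": [{"id": str, "image_base64": str}]}]}
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    parts = []
    for page in response.get("pages", []):
        parts.append(page.get("markdown", ""))
        for image in page.get("images", []):
            data = image.get("image_base64")
            if not data:
                continue  # no image payload in this entry
            # Drop a "data:image/...;base64," prefix when there is one.
            if data.startswith("data:") and "," in data:
                data = data.split(",", 1)[1]
            # Use only the file name so an id cannot escape out_dir.
            (out / Path(image["id"]).name).write_bytes(base64.b64decode(data))
    markdown = "\n\n".join(parts)
    (out / "document.md").write_text(markdown, encoding="utf-8")
    return markdown
```

Saving each image under its `id` only helps if the page Markdown references images by that same name, which is also an assumption here.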

How to translate Markdown?

Translating Markdown is simple. Just provide the Markdown content you want to translate to your preferred large language model, and add a prompt at the beginning or end like this:

> Please rewrite the input content in [target language], keeping the original Markdown format intact without any omissions, and make the content easy to understand.

However, if the Markdown content is too long, you need to split it into chunks manually, translate it part by part, and then merge the results. How much text can be handled at a time depends on the model. Of the models I tried, Gemini 2.5 Pro handles the longest input per request, while GPT-4.5 handles the shortest. That said, I personally think GPT-4.5 delivers the best translation quality, so I often prefer to split the content manually and translate it piece by piece with GPT-4.5.
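The split, translate, and merge steps can also be scripted. Here is a minimal sketch that breaks Markdown at blank lines, keeps fenced code blocks intact, and packs blocks into chunks up to a size limit. The `translate` callable and the 8000-character default are assumptions; real model limits are measured in tokens and vary by model.

```python
def split_markdown(text, max_chars=8000):
    """Split Markdown into chunks of at most max_chars, breaking on blank lines.

    Fenced code blocks are kept whole. A single block longer than
    max_chars is emitted as its own oversized chunk rather than cut.
    """
    blocks, current, in_fence = [], [], False
    for line in text.split("\n"):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        if line.strip() == "" and not in_fence:
            if current:
                blocks.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        blocks.append("\n".join(current))

    chunks, buf = [], ""
    for block in blocks:
        candidate = block if not buf else buf + "\n\n" + block
        if buf and len(candidate) > max_chars:
            chunks.append(buf)
            buf = block
        else:
            buf = candidate
    if buf:
        chunks.append(buf)
    return chunks


def translate_markdown(text, translate, max_chars=8000):
    """Translate chunk by chunk and merge the results in order.

    `translate` is a hypothetical callable (str -> str) that wraps
    whichever model you use.
    """
    return "\n\n".join(translate(chunk) for chunk in split_markdown(text, max_chars))
```

Breaking only at blank lines means a paragraph, table, or code block is never cut in the middle, which helps keep the Markdown format intact after merging.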

Regarding translation consistency, you can add a glossary to the translation prompt. For example:

> Please rewrite the input content in [target language], respect the original meaning, make it easy to understand for the general public, with no omissions, and do not translate proper names. Glossary:
> AI Agent -> AI Agent
> LLM -> Large Language Model

Alternatively, you can manually replace inconsistent terms after translation.
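Both options are easy to automate. The sketch below builds a glossary prompt in the format shown above and, as a fallback, replaces unwanted renderings after translation. The function names and the plain string replacement are illustrative assumptions; a real glossary may need word-boundary or case handling.

```python
def build_prompt(target_language, glossary):
    """Build a translation prompt with a fixed-term glossary appended."""
    lines = [
        f"Please rewrite the input content in {target_language}, respect the "
        "original meaning, make it easy to understand for the general public, "
        "with no omissions, and do not translate proper names. Glossary:"
    ]
    lines += [f"{source} -> {target}" for source, target in glossary.items()]
    return "\n".join(lines)


def apply_glossary(text, replacements):
    """Post-translation fallback: replace unwanted renderings of a term.

    `replacements` maps an unwanted rendering to the preferred one.
    Longer keys go first so a short key does not match inside a longer one.
    """
    for wrong in sorted(replacements, key=len, reverse=True):
        text = text.replace(wrong, replacements[wrong])
    return text
```

When the content is translated in chunks, sending the same glossary prompt with every chunk keeps terminology consistent across the merged result.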

How to translate a PDF with one click?

The method described above, converting the PDF to Markdown and then translating the Markdown, is relatively accurate, but it can be quite cumbersome. If your PDF is not very large, you can also have a large language model translate it in one step.

If the PDF is short, for example within 10 pages (the exact limit varies by model, so you may need to try a few times), you can upload it and directly ask the model to translate it and output the result in Markdown format.

If the content of a PDF file is relatively long but not overly lengthy, such as the 65-page Google Official Prompt Engineering White Paper I translated, here’s a tip: Use Deep Research to help you translate long PDFs.

Most people only use Deep Research to write research reports and are unaware that it can actually handle other tasks, such as translation and coding. Thanks to its temporary local storage and its usually long context window, Deep Research is fully capable of translating long content. For instance, a 65-page PDF cannot be translated in a normal conversation, but Deep Research handles it easily.

However, you cannot upload attachments in Deep Research. You need to place the PDF in a publicly accessible location, such as GitHub Pages, S3, etc., and then provide the URL for translation.

Here’s a simple prompt (translated into English):
> Please help me fully translate this PDF into Chinese and output it in Markdown format.
> PDF URL: {pdf url}

Deep Research can read and translate the content of PDFs via a browser.

Both OpenAI’s Deep Research and Google Gemini’s Deep Research can handle this long PDF translation task, but Gemini’s delivers better translation results. Additionally, Gemini’s results can be exported directly to Google Docs and then downloaded as a PDF. With OpenAI’s Deep Research, you need to copy the content out as Markdown, remove some unnecessary reference links, and then export it yourself, which is more troublesome.

Note that translation with Deep Research is still subject to the product’s length limits. A 65-page document is already close to the maximum. For longer documents, I recommend splitting them into multiple smaller PDFs and translating each one separately.
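Splitting a long PDF into smaller files can be scripted as well. This is a minimal sketch, assuming the third-party `pypdf` package is installed; the 50-page default and the `part-01.pdf` naming scheme are arbitrary choices for illustration, not limits stated by any product.

```python
def page_ranges(total_pages, max_pages):
    """Return (start, end) page index pairs, end exclusive, covering the document."""
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return [
        (start, min(start + max_pages, total_pages))
        for start in range(0, total_pages, max_pages)
    ]


def split_pdf(src_path, max_pages=50, prefix="part"):
    """Write src_path out as several PDFs of at most max_pages pages each."""
    # Imported here so page_ranges stays usable without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    outputs = []
    for number, (start, end) in enumerate(page_ranges(len(reader.pages), max_pages), 1):
        writer = PdfWriter()
        for index in range(start, end):
            writer.add_page(reader.pages[index])
        out_path = f"{prefix}-{number:02d}.pdf"
        with open(out_path, "wb") as handle:
            writer.write(handle)
        outputs.append(out_path)
    return outputs
```

Each output file then needs its own publicly accessible URL before it can be passed to Deep Research. Splitting by fixed page count can cut a section in half, so choosing split points at chapter boundaries may give better translations.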
