This article describes how to handle bidirectional (right-to-left) text in Word documents generated with Apache POI.

Problem description

I am trying to add some Hebrew text to a Word document. It works fine, but when I add punctuation the text gets messy.

This is the code I run:

public static void main(String[] args) throws Exception {

    XWPFDocument document = new XWPFDocument();
    XWPFParagraph paragraph = document.createParagraph();

    paragraph.setAlignment(ParagraphAlignment.LEFT);

    // make RTL direction
    CTP ctp = paragraph.getCTP();
    CTPPr ctppr;
    if ((ctppr = ctp.getPPr()) == null) {
        ctppr = ctp.addNewPPr();
    }
    ctppr.addNewBidi().setVal(STOnOff.ON);

    XWPFRun run = paragraph.createRun();
    run.setText("שלום עולם !");

    // create the document in the specific path by giving it a name
    File newFile = new File("helloWorld.docx");

    // insert document to newFile
    try {
        FileOutputStream output = new FileOutputStream(newFile);
        document.write(output);
        output.close();
        document.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}

This is the "helloWorld.docx" I get:

screenshot

And this is how it needs to look:

screenshot

Moreover, I want the whole document to be RTL (even when it contains bidirectional text), not just the specific paragraph.

Thanks for the help!

Solution

This is a well-known problem with bidirectional text. The exclamation mark and the space are not right-to-left characters themselves; they are directionally neutral. So we need to mark them as right-to-left where needed, using the RIGHT-TO-LEFT MARK (RLM), U+200F. See https://en.wikipedia.org/wiki/Bidirectional_text#Table_of_possible_BiDi_character_types.
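To see which characters carry a direction of their own, the JDK's Character.getDirectionality can be queried. This is only an illustrative sketch and not part of the original answer; the class name BidiClassDemo and the helper bidiClass are hypothetical.

```java
final class BidiClassDemo {

    /** Returns a short label for the Unicode bidi class of a character (hypothetical helper). */
    static String bidiClass(char c) {
        switch (Character.getDirectionality(c)) {
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT:
                return "R";   // strong right-to-left, e.g. Hebrew letters
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
                return "L";   // strong left-to-right, e.g. Latin letters
            case Character.DIRECTIONALITY_WHITESPACE:
                return "WS";  // neutral whitespace
            case Character.DIRECTIONALITY_OTHER_NEUTRAL:
                return "ON";  // other neutrals, e.g. most punctuation
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        // Hebrew letter shin, a space, an exclamation mark, and the RLM itself.
        char[] samples = { '\u05E9', ' ', '!', '\u200F' };
        for (char c : samples) {
            System.out.printf("U+%04X -> %s%n", (int) c, bidiClass(c));
        }
    }
}
```

The Hebrew letter and U+200F report the strong class R, while the space and the exclamation mark report neutral classes. That is why a neutral character at the end of a line needs an RLM next to it to stay on the right-to-left side.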

The following code works for me:

import java.io.FileOutputStream;

import org.apache.poi.xwpf.usermodel.*;

import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTPPr;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STOnOff;

public class CreateWordRTLParagraph {

 public static void main(String[] args) throws Exception {

  XWPFDocument doc= new XWPFDocument();

  XWPFParagraph paragraph = doc.createParagraph();
  CTP ctp = paragraph.getCTP();
  CTPPr ctppr;
  if ((ctppr = ctp.getPPr()) == null) ctppr = ctp.addNewPPr();
  ctppr.addNewBidi().setVal(STOnOff.ON);

  XWPFRun run = paragraph.createRun();
  run.setText("שלום עולם \u200F!\u200F");

  FileOutputStream out = new FileOutputStream("WordDocument.docx");
  doc.write(out);
  out.close();
  doc.close();

 }
}

Note the \u200F marks after the space and after the exclamation mark.
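The hand-placed marks can also be added programmatically. The following is a minimal sketch, assuming that only the neutral characters at the end of a string need marking; RlmMarker, isNeutral and markTrailingNeutrals are hypothetical names, not part of Apache POI.

```java
final class RlmMarker {

    private static final char RLM = '\u200F'; // RIGHT-TO-LEFT MARK

    /** True for whitespace and other neutral characters such as '!'. */
    static boolean isNeutral(char c) {
        byte d = Character.getDirectionality(c);
        return d == Character.DIRECTIONALITY_WHITESPACE
            || d == Character.DIRECTIONALITY_OTHER_NEUTRAL;
    }

    /** Appends an RLM after each neutral character in the trailing run of the text. */
    static String markTrailingNeutrals(String text) {
        // Walk back from the end to find where the trailing neutral run starts.
        int firstTrailing = text.length();
        while (firstTrailing > 0 && isNeutral(text.charAt(firstTrailing - 1))) {
            firstTrailing--;
        }
        StringBuilder sb = new StringBuilder(text.length() + 4);
        sb.append(text, 0, firstTrailing); // copy the unchanged prefix
        for (int i = firstTrailing; i < text.length(); i++) {
            sb.append(text.charAt(i)).append(RLM);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Hebrew sample text from the question, written with Unicode escapes.
        String marked = markTrailingNeutrals(
            "\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD !");
        for (char c : marked.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

The result could then be passed to run.setText(...) in place of the string literal. Neutral characters in the middle of the text are left alone, because they sit between right-to-left letters.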

If the text lines come from a file, marking single characters is not practical. Instead, the whole text line should be marked as right-to-left text. To do so, we can start each line with a U+202E RIGHT-TO-LEFT OVERRIDE (RLO) and end it with a U+202C POP DIRECTIONAL FORMATTING (PDF).

Example:

import java.io.File;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import org.apache.poi.xwpf.usermodel.*;

import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTPPr;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STOnOff;

import java.util.List;

public class CreateWordRTLParagraphsFromFile {

 public static void main(String[] args) throws Exception {

  List<String> lines = Files.readAllLines(new File("HebrewTextFile.txt").toPath(), StandardCharsets.UTF_8);

  XWPFDocument doc= new XWPFDocument();

  for (String line : lines) {

   XWPFParagraph paragraph = doc.createParagraph();
   CTP ctp = paragraph.getCTP();
   CTPPr ctppr = ctp.getPPr();
   if (ctppr == null) ctppr = ctp.addNewPPr();
   ctppr.addNewBidi().setVal(STOnOff.ON);

   XWPFRun run = paragraph.createRun();
   run.setText("\u202E" + line + "\u202C");

  }

  FileOutputStream out = new FileOutputStream("WordDocument.docx");
  doc.write(out);
  out.close();
  doc.close();

 }
}
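The wrapping step can be factored into a small helper so that empty lines and already wrapped lines are handled in one place. This is a sketch under those assumptions; RloWrapper and wrapRlo are hypothetical names, not POI API.

```java
final class RloWrapper {

    private static final String RLO = "\u202E"; // RIGHT-TO-LEFT OVERRIDE
    private static final String PDF = "\u202C"; // POP DIRECTIONAL FORMATTING

    /** Wraps a line in RLO ... PDF; empty or already wrapped lines are returned unchanged. */
    static String wrapRlo(String line) {
        if (line.isEmpty()) {
            return line; // nothing to override
        }
        if (line.startsWith(RLO) && line.endsWith(PDF)) {
            return line; // avoid nesting a second override
        }
        return RLO + line + PDF;
    }

    public static void main(String[] args) {
        String wrapped = wrapRlo("\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD !");
        for (char c : wrapped.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

In the loop above, run.setText(wrapRlo(line)) would replace the inline concatenation. Keep in mind that RLO forces every character between the two controls to be treated as right-to-left, including digits and Latin letters, so lines that mix scripts may need different handling.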

05-16 15:00