I am trying to add some Hebrew text to a Word document. It works fine, but when I add punctuation the text gets messy.
This is the code I run:
public static void main(String[] args) throws Exception {
    XWPFDocument document = new XWPFDocument();
    XWPFParagraph paragraph = document.createParagraph();
    paragraph.setAlignment(ParagraphAlignment.LEFT);
    // make RTL direction
    CTP ctp = paragraph.getCTP();
    CTPPr ctppr;
    if ((ctppr = ctp.getPPr()) == null) {
        ctppr = ctp.addNewPPr();
    }
    ctppr.addNewBidi().setVal(STOnOff.ON);
    XWPFRun run = paragraph.createRun();
    run.setText("שלום עולם !");
    // create the document in the specific path by giving it a name
    File newFile = new File("helloWorld.docx");
    // insert document to newFile
    try {
        FileOutputStream output = new FileOutputStream(newFile);
        document.write(output);
        output.close();
        document.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
This is the "helloWorld.docx" I get:
And this is how it should look:
Moreover, I want the whole document to be RTL (even with bidirectional text), not just the specific paragraph.
Thanks for the help!
This is a well-known problem with bidirectional text. The exclamation mark and the space are not right-to-left characters themselves, so where needed we have to mark them as such. The RIGHT-TO-LEFT MARK (RLM) is U+200F. See https://en.wikipedia.org/wiki/Bidirectional_text#Table_of_possible_BiDi_character_types.
The following code works for me:
import java.io.FileOutputStream;
import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTPPr;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STOnOff;

public class CreateWordRTLParagraph {
    public static void main(String[] args) throws Exception {
        XWPFDocument doc = new XWPFDocument();
        XWPFParagraph paragraph = doc.createParagraph();
        CTP ctp = paragraph.getCTP();
        CTPPr ctppr;
        if ((ctppr = ctp.getPPr()) == null) ctppr = ctp.addNewPPr();
        ctppr.addNewBidi().setVal(STOnOff.ON);
        XWPFRun run = paragraph.createRun();
        run.setText("שלום עולם \u200F!\u200F");
        FileOutputStream out = new FileOutputStream("WordDocument.docx");
        doc.write(out);
        out.close();
        doc.close();
    }
}
Note the \u200F mark after the space and after the exclamation mark.
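To see why the marks are needed, it can help to look at how the JDK classifies the characters involved. The sketch below uses only the standard `Character.getDirectionality` method; the class name `BidiClasses` and the helper `bidiClass` are made up for this illustration and are not part of Apache POI. It shows the classification only and says nothing about how Word itself lays out the run.

```java
public class BidiClasses {

    // Maps the JDK's directionality constant for c to a short Unicode bidi class label.
    static String bidiClass(char c) {
        switch (Character.getDirectionality(c)) {
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT:
                return "R";   // strong right-to-left, e.g. Hebrew letters
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
                return "L";   // strong left-to-right, e.g. Latin letters
            case Character.DIRECTIONALITY_WHITESPACE:
                return "WS";  // neutral whitespace
            case Character.DIRECTIONALITY_OTHER_NEUTRALS:
                return "ON";  // other neutrals, e.g. most punctuation
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        char hebrewShin = '\u05E9';
        char rlm = '\u200F';
        System.out.println("Hebrew letter shin: " + bidiClass(hebrewShin));
        System.out.println("space:              " + bidiClass(' '));
        System.out.println("exclamation mark:   " + bidiClass('!'));
        System.out.println("RLM (U+200F):       " + bidiClass(rlm));
    }
}
```

The space and the exclamation mark are reported as neutrals, while the invisible RLM is a strong right-to-left character, which is why placing it next to the punctuation pulls that punctuation into the right-to-left run.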
If the text lines come from a file, marking single characters is not best practice. Instead, the whole text line should be marked as right-to-left text. To do so, we can start each line with U+202E RIGHT-TO-LEFT OVERRIDE (RLO) and end it with U+202C POP DIRECTIONAL FORMATTING (PDF).
Example:
import java.io.File;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTPPr;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STOnOff;
import java.util.List;

public class CreateWordRTLParagraphsFromFile {
    public static void main(String[] args) throws Exception {
        List<String> lines = Files.readAllLines(new File("HebrewTextFile.txt").toPath(), StandardCharsets.UTF_8);
        XWPFDocument doc = new XWPFDocument();
        for (String line : lines) {
            XWPFParagraph paragraph = doc.createParagraph();
            CTP ctp = paragraph.getCTP();
            CTPPr ctppr = ctp.getPPr();
            if (ctppr == null) ctppr = ctp.addNewPPr();
            ctppr.addNewBidi().setVal(STOnOff.ON);
            XWPFRun run = paragraph.createRun();
            run.setText("\u202E" + line + "\u202C");
        }
        FileOutputStream out = new FileOutputStream("WordDocument.docx");
        doc.write(out);
        out.close();
        doc.close();
    }
}
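The wrapping step can be pulled into a small helper so it is easy to reuse and to test without Apache POI. This is only a sketch: the class `DirectionalWrap` and the method `wrapRtl` are hypothetical names. It also offers U+202B RIGHT-TO-LEFT EMBEDDING (RLE) as an alternative, on the assumption that the lines may contain digits or Latin words. Per the Unicode bidirectional algorithm, RLO forces every character in its scope to right-to-left, while RLE only sets the embedding direction. Whether Word renders RLE the way you want has not been verified here.

```java
public class DirectionalWrap {

    static final char RLE = '\u202B'; // RIGHT-TO-LEFT EMBEDDING
    static final char RLO = '\u202E'; // RIGHT-TO-LEFT OVERRIDE
    static final char PDF = '\u202C'; // POP DIRECTIONAL FORMATTING

    // Wraps a line in RLO...PDF (override = true) or RLE...PDF (override = false).
    // Empty lines are returned unchanged so blank paragraphs carry no control characters.
    static String wrapRtl(String line, boolean override) {
        if (line.isEmpty()) {
            return line;
        }
        char opener = override ? RLO : RLE;
        return opener + line + PDF;
    }

    public static void main(String[] args) {
        String wrapped = wrapRtl("\u05E9\u05DC\u05D5\u05DD !", true);
        // Print the code points, since the control characters are invisible.
        for (int i = 0; i < wrapped.length(); i++) {
            System.out.printf("U+%04X ", (int) wrapped.charAt(i));
        }
        System.out.println();
    }
}
```

In the loop above, `run.setText(DirectionalWrap.wrapRtl(line, true))` would be equivalent to the original concatenation for non-empty lines.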