Monday, July 4, 2011
Merge .docx files in Java using docx4j
This code is used in one of the projects I was attached to. The first source file (stream) is used as the master document, so all styles defined in it (incl. headers and footers) are applied to the subdocuments.

public class DocxService {

    private static final String CONTENT_TYPE = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

    public InputStream mergeDocx(final List<InputStream> streams) throws Docx4JException, IOException {
        WordprocessingMLPackage target = null;
        final File generated = File.createTempFile("generated", ".docx");

        int chunkId = 0;
        Iterator<InputStream> it = streams.iterator();
        while (it.hasNext()) {
            InputStream is = it.next();
            if (is != null) {
                if (target == null) {
                    // Copy first (master) document
                    OutputStream os = new FileOutputStream(generated);
                    os.write(IOUtils.toByteArray(is));
                    os.close();

                    target = WordprocessingMLPackage.load(generated);
                } else {
                    // Attach the others (Alternative input parts)
                    insertDocx(target.getMainDocumentPart(), IOUtils.toByteArray(is), chunkId++);
                }
            }
        }

        if (target != null) {
            target.save(generated);
            return new FileInputStream(generated);
        } else {
            return null;
        }
    }

    private static void insertDocx(MainDocumentPart main, byte[] bytes, int chunkId) {
        try {
            AlternativeFormatInputPart afiPart = new AlternativeFormatInputPart(new PartName("/part" + chunkId + ".docx"));
            afiPart.setContentType(new ContentType(CONTENT_TYPE));
            afiPart.setBinaryData(bytes);
            Relationship altChunkRel = main.addTargetPart(afiPart);

            CTAltChunk chunk = Context.getWmlObjectFactory().createCTAltChunk();
            chunk.setId(altChunkRel.getId());

            main.addObject(chunk);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Note: The generated file can be opened only in Microsoft Office 2007 or newer (Win/Mac). OpenOffice/LibreOffice render only the master document, ignoring the attached ones.
Labels: AlternativeFormatInputPart, docx, docx4j, java
It doesn't work on Fedora 13. An empty docx is produced as the result.
Thank you for your feedback.
I will install Fedora 13 on a VM and try to reproduce it.
Please send me (barusin@wszib.edu.pl) the merged file produced by your code on Windows (and the source files).
Sorry, my mistake: it works, but OpenOffice couldn't open the resulting document. Under Mac and Windows (Office 2011, 2007) it works fine.
No problem. Anyway, I sent you a sample project with updated sources.
I will update this post soon too :)
Thanks! Really good job.
ReplyDeleteThanks for the code but unfortunately I'm unable able to get it to produce any results. I pass the mergeDocx method a list of FileInputStreams that point to the files to be concatenated, then the code runs and seems to process everything but doesn't produce any resultant file and I can't figure out how to make it do so.
ReplyDeleteAny tips? Thanks!
Hi, John.
I think the best tip will be a working application. You can download it from http://dl.dropbox.com/u/23122948/docx4jDemo.tar.gz
Import it as a Maven project in your favorite IDE.
If you have more questions or need some help, don't hesitate to ask, I'll be glad to help.
Hi Stanislaw,
Can you please put up again the working application that was at http://dl.dropbox.com/u/23122948/docx4jDemo.tar.gz? I have problems with the code above and urgently need to resolve them.
Hi Stanislaw,
I managed to find a solution for my problem; luckily it was as simple as replacing lines 29 and 30 with:
SaveToZipFile saver = new SaveToZipFile(target);
saver.save(outputfilepath);
Where 'outputfilepath' is a string pointing to the desired save location.
However I'll still inspect the demo that you've provided as I'm sure I can still learn a thing or two from it. Many thanks for your super quick response and help - brilliant!
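If mergeDocx is kept as posted, it returns an InputStream over a temporary file rather than writing to a location of your choice, so another option is to copy that stream to a destination yourself. Below is a minimal sketch using only java.nio (Java 7 or newer). The class name MergedStreamSaver and the method saveTo are made up for illustration, and the byte array in main merely stands in for the stream that mergeDocx would return:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original post or of docx4j.
class MergedStreamSaver {

    /**
     * Copies the given stream to the destination path, replacing any
     * existing file, and returns the number of bytes written.
     * The stream is closed afterwards.
     */
    static long saveTo(InputStream merged, Path destination) throws IOException {
        if (merged == null) {
            // The posted mergeDocx returns null when every input stream was null.
            throw new IllegalArgumentException("nothing was merged");
        }
        try (InputStream in = merged) {
            return Files.copy(in, destination, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes; in real use this would be the result of mergeDocx(streams).
        InputStream merged = new ByteArrayInputStream(new byte[] {1, 2, 3});
        Path out = Files.createTempFile("merged", ".docx");
        System.out.println(saveTo(merged, out) + " bytes written to " + out);
    }
}
```

The null check mirrors the posted method's behavior of returning null when there was nothing to merge; whether to throw or to skip silently in that case is a design choice left open here.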
Hi John,
Can you please send me a link to download the working demo application /docx4jDemo.tar.gz?
Hello
I've tried the example above with several documents. The size of the resulting document seems to be approximately equal to the sum of the sizes of the other documents, but only the first one is visible. Am I missing something, like page breaks?
I am using docx4j 2.9.0-SNAPSHOT.jar.
Thanks
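A size close to the sum of the inputs is consistent with how the posted code works: insertDocx stores each additional document's bytes unchanged as an AlternativeFormatInputPart, and it is left to the application that opens the file to merge them into the visible text. Since a .docx is a ZIP archive, one way to check that the parts are really there is to list its entries. Below is a rough sketch using java.util.zip only. DocxEntryLister and its methods are hypothetical names, and the pattern part<N>.docx assumes the parts end up in the archive under the names given in insertDocx:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical diagnostic helper, not part of the original post or of docx4j.
class DocxEntryLister {

    /** Returns the names of all entries in a .docx (ZIP) stream, in archive order. */
    static List<String> entryNames(InputStream docx) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /**
     * Keeps only names that look like chunks added by insertDocx.
     * Assumption: they are stored as part<N>.docx, with or without a leading slash.
     */
    static List<String> embeddedChunks(List<String> names) {
        List<String> chunks = new ArrayList<>();
        for (String name : names) {
            if (name.matches("/?part\\d+\\.docx")) {
                chunks.add(name);
            }
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: DocxEntryLister <merged.docx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            List<String> names = entryNames(in);
            System.out.println("All entries: " + names);
            System.out.println("Embedded chunks: " + embeddedChunks(names));
        }
    }
}
```

If the chunks are listed but only the first document shows, the viewer is probably not processing them, which matches the note in the post about OpenOffice/LibreOffice rendering only the master document.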
Hello,
Excuse me, Daniel, where did you find docx4j 2.9.0-SNAPSHOT.jar?
thx
hi,
In my case it only worked when I removed line 39.
good work!
Hi Braynner,
Can you please post a link here to download a working demo application? I need it very quickly, please!
Hi Braunner.
I'm glad that the snippet was useful for you.
To Anonymous:
You can download the file from this address:
https://www.dropbox.com/s/w4yrfw979zimzw0/docx4jDemo.tar.gz
Hi, Stanislav,
Thank you very much for the code, but unfortunately I have a problem with the resulting file. The code runs and seems to process everything, and it generates a file that concatenates 2 docx files, but when I try to open that file I get the following error: "The file is corrupt and cannot be opened. Location: Part: word/document.xml."
Any idea? It's very urgent for me and I'd appreciate your help. Thanks for everything.
I even tried to open your ZResult file, which is the resulting file, and I get the same error: "The file is corrupt and cannot be opened. Location: Part: word/document.xml."
How can I solve this problem? Thanks again, Stanislav. I hope to hear from you soon, please.
This is great! Thank you very much!
Same error for me: the resulting file is corrupted, according to Microsoft 2016 for Mac OS X.
Is there a way to fix it?