feature: insert comment #93
Comments
@sriram-c you'll need to describe what you're trying to achieve more completely. I can't make out what you're trying to accomplish. |
I want to read a docx file sentence by sentence and check each word against a pattern. If a word matches, I want to add a comment to the docx file at that word's range. It is something like this: from docx import Document I hope it is now clear. |
What form would the comment take? Just inserting some text in parentheses or something or adding one of those comment things that appear in the margin alongside markup when you have "Show Markup" turned on? |
Hi Scanny, It should be a proper comment, one of those comment things that appear in the margin alongside markup when you have "Show Markup" turned on. Thanks, |
Unfortunately that "Add Comment" functionality hasn't been built out yet. We can leave this issue open as a feature request for it if you like, although I expect it will be a while before we get to it unless someone steps up to work on it. It's part of the broader functionality surrounding document markup, which is a bit of a hornet's nest. :) |
Is there any workaround for it using lxml or any other library? Can you give me some logic or a hint to accomplish it more quickly? Thanks, |
I expect it's a fair piece of work to accomplish. The approach I would recommend to get started is to create a baseline document with a single paragraph, save it, then add a single comment and save it again under a second name. Then you can use opc-diag to extract both documents into directories and then use diff to compare the two directories. I believe you'll find at least one new part will be added, perhaps called comments.xml. (A 'part' is a distinct 'file' in the ZIP archive. A Word document is a ZIP archive file at the top level.) There will also be some number of new relationships added in the .rels files and some sort of change to the paragraph or run where you inserted the comment. Making comments work would be making all those changes happen in the right spots. |
Thanks Scanny for the help. I have unzipped the docx file and looked into the comments.xml and document.xml files. Basically it adds a comment id in document.xml and keeps the details in comments.xml, e.g.

    <w:commentRangeStart w:id="1"/>
    <words commented .../>
    <w:commentRangeEnd w:id="1"/>

and in comments.xml

    <w:comment w:id="1">
      <!-- details -->
    </w:comment>

I changed comments.xml manually and zipped everything back into a docx file, and it works fine. Now how can I do it programmatically? My idea is to search for a word in the document tree, add comment start/end nodes before and after it, and add the comment details to comments.xml. How can I add a node to the document tree at a particular text point? |
Note how I updated your comment above to make the XML show up clearly. If you can post a more complete example without redacting the content elements I can offer more specific guidance. The specific elements that appear before, after, and inside all matter to the approach. |
Hi, thanks for the XML notation. For the time being I am using pywin32 and achieving the goal through Word objects directly. But this makes me depend on Windows, which I personally don't like (my favorite is Ubuntu), so I will wait until python-docx has sufficient features to handle word-level text and add different markup to the document. Thanks again for the help. |
I really want to contribute to this feature...but I'm new to this lib/OOP so I might need a LOT of guidance..the best place to look for inspiration for this is the add_text method in Run right? |
I don't think that example will get you very far. The comments live in a separate document "part", roughly speaking a separate file in the .docx zip package. They're keyed by ID. This one would be quite tough for a beginner I expect. |
Yeah it is quite tough. I've got a comments.xml type thing working using lxml for my own private use...but I don't know how to use the docx library to generate a new part...can you lead me to a direction on that? |
I always start off by being able to read the new part type, that provides a lot of the foundation and a mechanism for testing the writing part.
That should get you started. Let me know when you need more. Note that you'll need full tests if you want a commit. I would definitely spike it in first just to figure out how to do it, but then you'll need to redevelop outside-in with acceptance and unit tests if you want to get the commit. Good luck :) |
Speaking of spiking...with reference to issue number 55 you gave a function as follows to directly append XML in a table:
What would be an equivalent of adding a |
If you can provide an XML snippet that includes the w:r (or multiple) for context and the w:commentRangeWhatevers in the proper place I'll take a look and see what guidance I can offer. |
Thanks for the response! Here's a snippet: <w:p>
<w:r>
<w:t>
<!-- COMMENT STARTS HERE -->
<w:commentRangeStart w:id="0"/>
This is text in a paragraph.
</w:t>
</w:r>
</w:p>
<w:p w:rsidR="002F06B3" w:rsidRDefault="002F06B3" w:rsidP="002F06B3"/>
<w:p w:rsidR="002F06B3" w:rsidRDefault="002F06B3" w:rsidP="002F06B3">
<w:pPr>
<w:pStyle w:val="NormalWeb"/>
</w:pPr>
<w:r>
<w:t>I have manually changed in the comments.xml and zipped it to cr</w:t>
</w:r>
<w:r>
<w:t xml:space="preserve">eate </w:t>
<w:commentRangeEnd w:id="0"/> <!--COMMENT ENDS -->
</w:r>
<w:r>
<w:t>a comment.</w:t>
</w:r>
</w:p> |
Ok, well, something along the lines of this aircode should get it done for you:

    from docx.oxml import OxmlElement
    from docx.oxml.ns import qn

    def add_comment_start_to_run(run, id):
        # run._r is the <w:r> element underlying the python-docx Run
        r = run._r
        commentRangeStart = OxmlElement('w:commentRangeStart')
        commentRangeStart.set(qn('w:id'), str(id))
        r.append(commentRangeStart)

Check the __init__() method in Run to confirm the element name. It might be _element instead of _r. Let us know how you go :) |
The _r at the end is supposed to be r, right? It works well, but there is one teeny-tiny problem: the function makes the element appear at the end of the run instead of the start. I'm thinking of adding the RangeStart at the end of the previous run, since it won't show anyway. Is that a good approach? Is there a direct way to get this done? I tried prepending, but CT_R doesn't support it :(
|
Yes, quite right on the r. Instead of appending, you can insert at the front:

    r.insert(0, commentRangeStart)

Note that the sequence of children is generally significant in Open XML; it's worthwhile to check the XML Schema to make sure you're putting it in a valid position in the child sequence. This analysis document has a schema excerpt for CT_Run, which corresponds to the <w:r> element. So you might end up needing something more like this:

    r = run._r
    rPr = r.rPr
    if rPr is None:
        r.insert(0, commentRangeStart)
    else:
        rPr.addnext(commentRangeStart)

The experienced programmer will recognize this as a good opportunity to add an You should check out the The XML Schema files are here in the repo for ready reference: The one named |
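The "after w:rPr if present, otherwise first" rule from the reply above can be tried out with the standard library alone. This is a sketch, not python-docx code: it uses `xml.etree.ElementTree`, which has no `addnext()`, so the position is computed by index, and the helper names `w` and `insert_after_rpr` are invented for illustration.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    """Return the Clark-notation name for a tag in the w: namespace."""
    return "{%s}%s" % (W, tag)


def insert_after_rpr(r, new_child):
    """Insert new_child right after <w:rPr> if present, else first in <w:r>."""
    children = list(r)
    rpr = r.find(w("rPr"))
    index = children.index(rpr) + 1 if rpr is not None else 0
    r.insert(index, new_child)


run = ET.fromstring(
    '<w:r xmlns:w="%s"><w:rPr><w:b/></w:rPr><w:t>text</w:t></w:r>' % W
)
marker = ET.Element(w("commentRangeStart"), {w("id"): "0"})
insert_after_rpr(run, marker)
order = [child.tag.split("}")[1] for child in run]
print(order)
```

Whether the marker belongs inside the run at all should still be checked against the schema, as the reply suggests; a later post in this thread shows the range markers as siblings of the run rather than children.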
This is still a cool feature to have; @scanny, are we planning to build it anytime soon? |
Plus one on this for me too. |
So I've been trying to implement this feature as described above and wanted to share the explicit changes I needed to make to add a comment to a run. in ../word/document.xml my paragraph looks like this. Note the commentReference section. <w:p>
<w:r>
<w:t xml:space="preserve">Some </w:t>
</w:r>
<w:commentRangeStart w:id="0"/>
<w:r>
<w:t>text.</w:t>
</w:r>
<w:commentRangeEnd w:id="0"/>
<w:r>
<w:commentReference w:id="0"/>
</w:r>
</w:p> I had to create ../word/comments.xml <w:comments mc:Ignorable="w14 w15 wp14">
<w:comment w:id="0" w:author="Talbert, Colin" w:date="2017-09-01T08:15:00Z" w:initials="CT">
<w:p>
<w:pPr>
<w:pStyle w:val="CommentText"/>
</w:pPr>
<w:r>
<w:rPr>
<w:rStyle w:val="CommentReference"/>
</w:rPr>
<w:annotationRef/>
</w:r>
<w:r>
<w:t>test</w:t>
</w:r>
</w:p>
</w:comment>
</w:comments> and in ../word/_rels/document.xml.rels I had to add a Relationship to comments.xml
I can follow the code above to add the commentRangeStart and End. But would certainly appreciate any pointers for how to add: If it looks like a feature that makes for the package I could potentially submit a pull request. |
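For the comments.xml side, a `w:comment` entry shaped like the one posted above can be assembled as an element tree. The sketch below uses only the standard library rather than python-docx; `build_comment` and its parameters are hypothetical names, and it does not cover the relationship in document.xml.rels or any content-type entry, which the package would still need.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # serialize with the usual w: prefix


def w(tag):
    """Return the Clark-notation name for a tag in the w: namespace."""
    return "{%s}%s" % (W, tag)


def build_comment(comment_id, author, initials, date, text):
    """Build one <w:comment> element mirroring the structure posted above."""
    comment = ET.Element(w("comment"), {
        w("id"): str(comment_id),
        w("author"): author,
        w("date"): date,
        w("initials"): initials,
    })
    p = ET.SubElement(comment, w("p"))
    ppr = ET.SubElement(p, w("pPr"))
    ET.SubElement(ppr, w("pStyle"), {w("val"): "CommentText"})
    # First run carries the annotation reference mark.
    ref_run = ET.SubElement(p, w("r"))
    rpr = ET.SubElement(ref_run, w("rPr"))
    ET.SubElement(rpr, w("rStyle"), {w("val"): "CommentReference"})
    ET.SubElement(ref_run, w("annotationRef"))
    # Second run carries the comment text itself.
    text_run = ET.SubElement(p, w("r"))
    ET.SubElement(text_run, w("t")).text = text
    return comment


comments = ET.Element(w("comments"))
comments.append(build_comment(0, "A. Author", "AA", "2017-09-01T08:15:00Z", "test"))
xml = ET.tostring(comments, encoding="unicode")
print(xml)
```

The `w:id` used here has to match the id on the `w:commentRangeStart`, `w:commentRangeEnd`, and `w:commentReference` elements in document.xml.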
In the Open Packaging Convention (OPC) parlance, an XML document like A pretty good example of adding a part to a package is here in
Basically you define a custom part class for comment, register it with the loader (although might be optional if you know for sure the file you open will never have one), and then do pretty much the same things that The XML manipulation is done by getting a reference to the nearest parent or sibling element, like a run = paragraph.runs[0] # or whatever run you decide
r = run._r # this is the <w:r> element of that run
r.addnext(commentRangeStart) The There are various ways to create the new elements you want (like Let me know how all this strikes you and I can probably give more guidance when you have specific questions along the way. |
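The sibling-insertion idea in the reply above can be sketched end to end for one run. This version uses `xml.etree.ElementTree` from the standard library instead of python-docx's lxml elements, so it inserts by index rather than with `addnext()`; `wrap_run_with_comment` is a hypothetical helper name, and the resulting layout follows the document.xml excerpt posted earlier in the thread.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    """Return the Clark-notation name for a tag in the w: namespace."""
    return "{%s}%s" % (W, tag)


def wrap_run_with_comment(p, r, comment_id):
    """Surround run r (a child of paragraph p) with comment range markers."""
    cid = str(comment_id)
    start = ET.Element(w("commentRangeStart"), {w("id"): cid})
    end = ET.Element(w("commentRangeEnd"), {w("id"): cid})
    ref_run = ET.Element(w("r"))
    ET.SubElement(ref_run, w("commentReference"), {w("id"): cid})
    index = list(p).index(r)
    p.insert(index, start)        # marker lands just before the run
    p.insert(index + 2, end)      # the run itself is now at index + 1
    p.insert(index + 3, ref_run)  # reference run follows the end marker


paragraph = ET.fromstring(
    '<w:p xmlns:w="%s">'
    '<w:r><w:t xml:space="preserve">Some </w:t></w:r>'
    '<w:r><w:t>text.</w:t></w:r>'
    '</w:p>' % W
)
wrap_run_with_comment(paragraph, paragraph[1], 0)
order = [child.tag.split("}")[1] for child in paragraph]
print(order)
```

With python-docx objects the same steps would operate on `run._r` and its parent; that mapping is an assumption to verify against the library version in use.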
That's extremely helpful. Here's the code I have to create the comment in the document.xml part. It seems to be doing what I want for that file.
This might be too hacky, but I added my reference items to document.xml.rels and [Content_Types].xml by editing the contents of docx/templates/default.docx to include them. It seems to be working. Now I just need to figure out how to add the comment to comments.xml. I'll dig into your links in pptx and see if that helps. |
Hi, regarding inserting comments into a document: I need to understand whether each comment wrapped in a <w:comment> tag is a list of paragraph objects. My use case concerns writing comments to a document, and I need to know how I could create Paragraph objects on their own, without attaching them to a Document object. I don't really want to use docx.Document.add_paragraph() to create a paragraph object. Thanks. |
@talbertc-usgs Hi, Have you figured out how to add comments to comments.xml. If so, could you please help me with how to get on with that thing? Thanks. |
Hi @scanny. I've been trying to add comments to docx files using Python, and I got everything working, but in a very hacky way. I am really eager to contribute to this feature, but I don't know where to start. I used the BeautifulSoup XML parser to implement commenting and didn't use python-docx. It would be really great if you could tell me where to start. My use case required adding comments inside a given paragraph for a given sentence (not a run), so I had to split the runs in the paragraph accordingly and add comment references on top of that.
Thanks. |
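Splitting a run so that a comment can cover only part of its text, as described in the post above, might look roughly like this. The poster's BeautifulSoup code is not shown in the thread, so this is an independent sketch using the standard library; `split_run` is a hypothetical helper and it assumes the run contains a single `w:t` child.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def w(tag):
    """Return the Clark-notation name for a tag in the w: namespace."""
    return "{%s}%s" % (W, tag)


def split_run(p, r, start, end):
    """Split r's text at [start:end) into sibling runs; return the middle run.

    Assumes r has exactly one <w:t> child. Each new run is a deep copy of r,
    so run properties (<w:rPr>) are carried over.
    """
    text = r.find(w("t")).text or ""
    pieces = [text[:start], text[start:end], text[end:]]
    index = list(p).index(r)
    p.remove(r)
    middle = None
    for position, piece in enumerate(pieces):
        if not piece:
            continue  # skip empty leading or trailing pieces
        new_r = copy.deepcopy(r)
        new_t = new_r.find(w("t"))
        new_t.text = piece
        new_t.set(XML_SPACE, "preserve")  # keep leading/trailing spaces
        p.insert(index, new_r)
        index += 1
        if position == 1:
            middle = new_r
    return middle


paragraph = ET.fromstring(
    '<w:p xmlns:w="%s"><w:r><w:rPr><w:b/></w:rPr>'
    '<w:t>Some text here</w:t></w:r></w:p>' % W
)
middle = split_run(paragraph, paragraph[0], 5, 9)
texts = [run.find(w("t")).text for run in paragraph]
print(texts)
```

The returned middle run is the one that comment range markers would then be placed around. A sentence that spans several runs would need the first and last runs split separately.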
Hey, I know it's been a year, but I stumbled over this thread and I need to implement comments as well. I'm missing the "SentenceLevelEditor" as well as other self variables. Thanks in advance |
I'm also keen to know if Comments editing is possible yet. |
Any updates? |
Same here this would be a very useful feature to me. |
It looks like #624 fixes this. For now you can pip install the fork mentioned there and add comments although it does not appear you can format the text. |
I also needed to add comments to Word documents recently. I compared the XML extracted from the original Word file with the XML of a copy that had comments added, then modified the XML and compressed it back into a Word file: import zipfile for file in templateDocx.filelist: comFile = commentDocx.namelist() templateDocx.close() |
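The code in the post above did not survive intact, so the following is not a reconstruction of it. It is a separate minimal sketch of the same general approach with the standard `zipfile` module: copy every part of a package into a new archive while replacing or adding selected parts. `rewrite_package` is a hypothetical name and the in-memory archives are stand-ins for real .docx files.

```python
import io
import zipfile


def rewrite_package(source, target, replacements):
    """Copy a .docx package, replacing or adding the parts in `replacements`.

    `replacements` maps part names (e.g. "word/comments.xml") to new content.
    """
    with zipfile.ZipFile(source) as src, \
            zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = replacements.get(item.filename, src.read(item.filename))
            dst.writestr(item, data)
        existing = set(src.namelist())
        for name, data in replacements.items():
            if name not in existing:
                dst.writestr(name, data)  # brand-new part


source = io.BytesIO()
with zipfile.ZipFile(source, "w") as package:
    package.writestr("word/document.xml", "<old/>")
    package.writestr("[Content_Types].xml", "<Types/>")
source.seek(0)

target = io.BytesIO()
rewrite_package(source, target, {
    "word/document.xml": "<new/>",
    "word/comments.xml": "<comments/>",
})
with zipfile.ZipFile(target) as result:
    names = sorted(result.namelist())
    document_xml = result.read("word/document.xml")
print(names)
```

As noted earlier in the thread, adding comments.xml also calls for a matching Relationship in word/_rels/document.xml.rels and an entry in [Content_Types].xml; those would be passed through `replacements` as well.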
Any possibility to integrate the patch from https://github.com/BayooG/bayoo-docx ? |
For anybody in search of a quick solution that doesn't change a lot of the base docx package and for the AIs of this world training on this, here's a simple helper function to add comments to xml elements:
Example usage:
Or for a single paragraph:
Hope this helps :) |
@lucacampanella Thanks for the code, it is working! However, I noticed that after I save the document in Python using .save() from the docx library and open the edited version in Word, it says: "Word found unreadable content in "...". Do you want to recover the contents of this document? If you trust the source of this document, click Yes." The added comment does exist in the edited version, but the error message is somewhat annoying. Did you experience that as well? |
@JTGRC-public |
@lucacampanella Thanks for the quick reply! I took a look at the underlying comments.xml file and noticed that this happens on my side whenever the input Word file already contains comments. Each existing comment that I added manually via Word has an additional line right below the line with the author info, starting with "ns0:p", with keys including "ns2:paraId", "ns2:textId", "ns0:rsidR", "ns0:rsidRDefault", and "ns0:rsidP", while the comments added by your code don't have this line. Once I recover the document and save it again, Word automatically adds this line to the comments added by your code as well, with one difference: all comments then have "w14:paraId" and "w14:textId" instead of "ns2:paraId" and "ns2:textId". |
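Prefixes such as "ns0" and "ns2" look like the names Python's `xml.etree.ElementTree` generates when it serializes a namespace that has no registered prefix. Whether that is what happened here, and whether it is related to Word's warning, cannot be told from the thread, since the helper code is not shown. As a sketch under that assumption, registering the prefixes before serializing keeps the conventional `w:` and `w14:` names:

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W14 = "http://schemas.microsoft.com/office/word/2010/wordml"

source = (
    '<w:comments xmlns:w="%s" xmlns:w14="%s">'
    '<w:comment w:id="0"><w:p w14:paraId="00000001"/></w:comment>'
    '</w:comments>' % (W, W14)
)
root = ET.fromstring(source)

# Serialized before any prefix is registered for these namespaces.
before = ET.tostring(root, encoding="unicode")

# Registration is global for the process and affects later serialization.
ET.register_namespace("w", W)
ET.register_namespace("w14", W14)
after = ET.tostring(root, encoding="unicode")

print(before)
print(after)
```

Both serializations describe the same elements to a namespace-aware XML parser; the difference is only in the prefix text, which can matter when other attributes in the file refer to prefixes by name.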
Thank you very much my bro. This works like a charm. |
Thanks @lucacampanella, I've gotta say, your code snippet works like a charm! However, it only works for paragraphs. Is there a way to include things like paragraphs + tables + images? This would be perfect. I thought about doing something like:

    from itertools import zip_longest

    from docx import Document

    def find_section_elements(docx_doc: Document, section_title: str, section_style_begin: str, section_style_end: str) -> list:
        """Find all elements (paragraphs, tables) of a specified section."""
        paragraphs = docx_doc.paragraphs
        tables = docx_doc.tables
        section_elements = list()
        in_section = False
        # Note: zip_longest pairs the i-th paragraph with the i-th table,
        # so this loop does not follow the order of elements in the document.
        for paragraph, table in zip_longest(paragraphs, tables, fillvalue=None):
            if paragraph:
                if section_title in paragraph.text and paragraph.style.name == section_style_begin:
                    in_section = True
                elif in_section and paragraph.style.name == section_style_end:
                    break
                elif in_section:
                    section_elements.append(paragraph)
            if table:
                if in_section:
                    section_elements.append(table)
        return section_elements

And then, to use your function:

    section_content = find_section_elements(mydocx_doc, "Section Title", "Heading 1", "Heading 2")
    add_comment_to_elements_in_place(
        mydocx_doc,
        [paragraph._element for paragraph in section_content],
        "Me",
        "My great comment",
    )
    mydocx_doc.save("auto_commented_doc.docx")

I also don't know how to handle the case where I need to comment a section with an image in it. My final goal is to be able to comment sections that contain images, paragraphs, and tables. |
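One thing to watch in the approach above is ordering: paragraphs and tables come from two separate lists, so their relative order in the document is lost. A way around that is to walk the children of `w:body` directly, which yields block-level items in document order. The sketch below uses the standard library on a hand-written body fragment; `iter_block_items` is a hypothetical helper, and with python-docx the same walk would run over the document's body element.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    """Return the Clark-notation name for a tag in the w: namespace."""
    return "{%s}%s" % (W, tag)


def iter_block_items(body):
    """Yield (kind, element) for top-level paragraphs and tables, in order."""
    for child in body:
        if child.tag == w("p"):
            yield "paragraph", child
        elif child.tag == w("tbl"):
            yield "table", child
        # other children, such as <w:sectPr>, are skipped


body = ET.fromstring(
    '<w:body xmlns:w="%s">'
    '<w:p><w:r><w:t>Intro</w:t></w:r></w:p>'
    '<w:tbl><w:tr><w:tc><w:p/></w:tc></w:tr></w:tbl>'
    '<w:p><w:r><w:t>After the table</w:t></w:r></w:p>'
    '<w:sectPr/>'
    '</w:body>' % W
)
kinds = [kind for kind, _ in iter_block_items(body)]
print(kinds)
```

Pictures are normally stored as a `w:drawing` inside a run of a paragraph, so an image would usually be covered by whatever range includes its paragraph; that is worth confirming on a sample document.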
For anyone in need of a quick solution, I've implemented lucacampanella's solution in my import cmi_docx
cmi_docx.add_comment(document, (start_run, end_run), "comment author", "comment text") |
Hi,
I want to insert comments on certain words in a docx file. How can that be done using the docx package? Please suggest.