Feature Request - Splitting and recombining an inputted document #449

DragonflyStats · 2022-09-15T16:39:51Z

Hi there - my query relates to dividing / truncating / splitting a document at a specified point.

Suppose I have a pre-existing Word document, and there are 3 parts (Part 1, Part 2 and Part 3).
I would like to be able to separate the document into the three parts, and recombine Parts 1 and 3 with a new Part 2.

Here is some pseudo-code that hopefully expresses the idea:



# my_New_Part_2 is already created by {officer}

myDoc <- read_docx("my_existing_doc.docx")

# when the start and end arguments are left blank, they default to the start and end of the inputted document

my_Part_1 <- myDoc %>% doc_split(start = "", end = "piece of text that identifies the end of Part 1")

my_Part_2 <- myDoc %>% doc_split(
  start = "piece of text that identifies the end of Part 1",
  end = "piece of text that identifies the end of Part 2"
)

my_Part_3 <- myDoc %>% doc_split(
  start = "piece of text that identifies the end of Part 2",
  end = ""
)

myDoc <- doc_combine(my_Part_1, my_New_Part_2, my_Part_3)

print(myDoc)

The rationale here is that Part 2 of the document is too complex to edit with body_replace_text(), or may have images that need to be updated. Additionally, there is no way of telling where in the document (in terms of page number) Part 2 starts.

Update 1

I think this code segment on Stack Overflow might be able to effectively extract my_Part_3:
https://stackoverflow.com/questions/71811129/how-to-subset-text-from-a-word-docx-after-a-matching-phrase/72018891#72018891
If I can get my_Part_1 as well, then I should have an effective solution.

Update 2

I am trying the inverse of the solution presented previously, to extract my_Part_1.
It is not working. I think the issue is that the error argument needs to be something else.


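body_remove_after_cursor <- function(x) {
  tryCatch(
    {
      # move the cursor to the next element, then remove the element at the cursor
      x <- officer::cursor_forward(x)
      x <- officer::body_remove(x)
      # recurse on the modified document
      body_remove_after_cursor(x)
    },
    # the recursion only stops when one of the calls above raises an error
    error = function(e) {
      return(x)
    }
  )
}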


davidgohel · 2022-12-19T08:46:34Z

Hello @DragonflyStats

Sorry for the lack of feedback. We will try, but not sure how :)

I think it is indeed necessary to use a technique based on the cursors and body_remove(). But I doubt we can have a subset of a document as in your proposal (we are using R6, and this would probably mean refactoring the whole package).

This is what I have in mind:

my_New_Part_2 could be made with block_list()
my_New_Part_2 would replace from "piece of text that identifies the end of Part 1" to "piece of text that identifies the end of Part 2"
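
The idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an officer feature: replace_between() is a hypothetical helper, the marker strings and the demo document are made up, and it assumes each marker matches exactly one top-level body element, with the end marker located after the start marker. It re-reaches the start marker on every pass so that it does not depend on where the cursor lands after body_remove().

```r
library(officer)

# Hypothetical helper (not part of officer): removes every body element that
# follows the element matching `start_marker`, up to and including the element
# matching `end_marker`, then inserts `blocks` right after the start marker.
# Assumption: each marker matches exactly one body element, end after start.
replace_between <- function(x, start_marker, end_marker, blocks) {
  # Markers are treated as regular expressions, as in cursor_reach().
  stopifnot(
    cursor_reach_test(x, start_marker),
    cursor_reach_test(x, end_marker)
  )
  # Keep removing the element after the start marker until the element
  # holding the end marker has been removed as well.
  while (cursor_reach_test(x, end_marker)) {
    x <- cursor_reach(x, start_marker) # go back to the end of Part 1
    x <- cursor_forward(x)             # step onto the next element
    x <- body_remove(x)                # remove it
  }
  x <- cursor_reach(x, start_marker)
  body_add_blocks(x, blocks = blocks, pos = "after")
}

# Small demo document standing in for "my_existing_doc.docx"
doc <- read_docx()
for (txt in c("Intro", "END OF PART 1", "old a",
              "old b, END OF PART 2", "Part 3 text")) {
  doc <- body_add_par(doc, txt)
}

# The new Part 2, built with block_list()
my_New_Part_2 <- block_list(
  fpar(ftext("New Part 2, first paragraph")),
  fpar(ftext("New Part 2, second paragraph"))
)

doc <- replace_between(doc, "END OF PART 1", "END OF PART 2", my_New_Part_2)

# Inspect the paragraph texts, then write the result to a temporary file
texts <- docx_summary(doc)$text
print(doc, target = tempfile(fileext = ".docx"))
```

In this sketch the paragraph containing the end-of-Part-1 marker is kept, while the paragraph containing the end-of-Part-2 marker is removed together with the rest of the old Part 2. If a marker is missing, the helper stops before touching the document.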

DragonflyStats changed the title ~~Splitting and recombining an inputted document~~ Feature Request - Splitting and recombining an inputted document Sep 15, 2022
