OLE questions - extracting data from embedded "Package" in Word to save to a file

EDN Admin

I am trying to write code that will extract the embedded objects in a Word document and save each one to a separate file. I found code that works with embedded Word and Excel documents (it activates the embedded object, then calls Word.Document.SaveAs).
Here is the code I have for extracting the Word documents. I am limiting the versions of Word being processed to Word.Document.8 for now.
[code]
Private Sub processFileButton_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles processFileButton.Click

    Dim oWord As Word.Application
    Dim oDoc As Word.Document
    Dim inl As Word.InlineShape
    Dim wordDocument As Word.Document

    Dim i As Integer
    Dim outputFileName As Object

    ' A file name is required before anything else happens.
    If String.IsNullOrEmpty(fileNameTextBox.Text) Then
        MessageBox.Show("File name is required", "Error", MessageBoxButtons.OK)
        Exit Sub
    End If

    If MessageBox.Show("Press OK to process file: " & fileNameTextBox.Text, "Ready to process file?", MessageBoxButtons.OKCancel, MessageBoxIcon.Question, MessageBoxDefaultButton.Button1) = Windows.Forms.DialogResult.Cancel Then
        MessageBox.Show("Process cancelled", "Process cancelled.", MessageBoxButtons.OK, MessageBoxIcon.Exclamation)
        Exit Sub
    End If

    ' Start Word (hidden) and open the document.
    oWord = CreateObject("Word.Application")
    oWord.Visible = False

    oDoc = oWord.Documents.Open(fileNameTextBox.Text)

    i = 0

    ' Process each inline shape. i is the 1-based position of the shape
    ' and is used to build a distinct output file name.
    For Each inl In oDoc.InlineShapes
        i = i + 1

        ' Inline shapes that are not OLE objects (for example pictures) may have
        ' no OLEFormat, so check for Nothing before reading the ProgID.
        If inl.OLEFormat IsNot Nothing AndAlso inl.OLEFormat.ProgID = "Word.Document.8" Then

            outputFileName = fileNameTextBox.Text & "-embed-word" & i & ".doc"

            ' Activate the embedded object, then save it as a document of its own.
            inl.OLEFormat.Activate()
            wordDocument = inl.OLEFormat.Object
            wordDocument.SaveAs(outputFileName)

            wordDocument.Close()

        End If

    Next inl

    oDoc.Close()
    oWord.Quit()

    MessageBox.Show("File: " & fileNameTextBox.Text & " processed.", "File Processed", MessageBoxButtons.OK, MessageBoxIcon.Information)

End Sub
[/code]

The problem I now have is extracting other types of files, such as PDF files. Adobe has an SDK for PDF files (which I haven't gotten to work yet), but there is another type of embedding that I don't know how to handle. If someone embeds an object that does not have a registered handler, the object gets embedded with OLEFormat.ProgID = "Package". This also happens to PDF documents if the PC running Word does not have Adobe software installed.
I've seen other posts that mention having to get the data from the OleNative stream of the OLE object, but I haven't seen any code that does that.
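For reference, here is a minimal sketch of what decoding that stream might look like once its bytes are in hand. The stream is normally named Ole10Native (prefixed with a character of code 1) inside the object's storage. The field layout below follows the way the format is commonly described in reverse-engineering write-ups, not an official specification, so treat it as an assumption. The names Ole10NativeSketch, PackagePayload and ParseOle10Native are made up for illustration, and reading the stream out of the .doc compound file is a separate step that is not shown.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text

Module Ole10NativeSketch

    ' Hypothetical container for the fields of interest.
    Public Class PackagePayload
        Public FileName As String
        Public SourcePath As String
        Public Data As Byte()
    End Class

    ' Reads bytes up to (not including) a zero terminator and decodes them
    ' with the system default encoding (the ANSI code page on .NET Framework).
    Private Function ReadZeroTerminated(reader As BinaryReader) As String
        Dim bytes As New List(Of Byte)
        Dim b As Byte = reader.ReadByte()
        While b <> 0
            bytes.Add(b)
            b = reader.ReadByte()
        End While
        Return Encoding.Default.GetString(bytes.ToArray())
    End Function

    ' Assumed layout: size (4 bytes), marker (2 bytes), file name, source path,
    ' two 4-byte values, temp path, data size (4 bytes), then the file data.
    Public Function ParseOle10Native(streamBytes As Byte()) As PackagePayload
        Using reader As New BinaryReader(New MemoryStream(streamBytes))
            Dim result As New PackagePayload()
            reader.ReadUInt32()                 ' size of everything that follows
            reader.ReadUInt16()                 ' marker, meaning assumed unknown
            result.FileName = ReadZeroTerminated(reader)
            result.SourcePath = ReadZeroTerminated(reader)
            reader.ReadUInt32()                 ' unknown value
            reader.ReadUInt32()                 ' unknown value
            ReadZeroTerminated(reader)          ' temp path, not needed here
            Dim size As Integer = CInt(reader.ReadUInt32())
            result.Data = reader.ReadBytes(size)
            Return result
        End Using
    End Function

End Module
```

The Data array could then be written out with File.WriteAllBytes, ideally using only the file-name part of FileName (Path.GetFileName) so a stored path cannot redirect the output.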
Does anyone have Visual Basic code that will extract the OleNative stream of an OLE object embedded in a Word 2003 (.doc) file? Also, would there be any difference for embedded documents created by other versions of Word (i.e., will it work for .docx files)?
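One thing that may help with the .docx half of the question: a .docx file is a ZIP package, and Word normally stores embedded objects as parts under word/embeddings/ (typically with names like oleObject1.bin). The sketch below copies those parts out without starting Word. It assumes .NET Framework 4.5 or later with references to System.IO.Compression and System.IO.Compression.FileSystem; DocxEmbeddingsSketch, ExtractEmbeddedParts and outputFolder are hypothetical names.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.IO.Compression

Module DocxEmbeddingsSketch

    ' Copies every part under word/embeddings/ out of the .docx into
    ' outputFolder and returns the list of files written.
    Public Function ExtractEmbeddedParts(docxPath As String, outputFolder As String) As List(Of String)
        Dim written As New List(Of String)
        Directory.CreateDirectory(outputFolder)

        Using archive As ZipArchive = ZipFile.OpenRead(docxPath)
            For Each entry As ZipArchiveEntry In archive.Entries
                ' entry.Name is empty for directory entries, so skip those.
                If entry.FullName.StartsWith("word/embeddings/", StringComparison.OrdinalIgnoreCase) AndAlso entry.Name <> "" Then
                    Dim target As String = Path.Combine(outputFolder, entry.Name)
                    entry.ExtractToFile(target, True)   ' True = overwrite existing file
                    written.Add(target)
                End If
            Next
        End Using

        Return written
    End Function

End Module
```

Each extracted .bin part is itself an OLE compound file, so for a "Package" object the Ole10Native stream would still have to be read out of it with a compound-file reader before the original file can be recovered.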
Please let me know if this should be posted in a different forum.
