Convert tables in PDF document to tables in Word correctly
The PDF documents contain tables. I tried Adobe Acrobat Professional 9 (and a few other tools) to convert them to MS-Word documents, but the tables in the resulting Word documents are not real tables; they come through as merged cells. If we insert text into them, the content of the following cells is pushed out of alignment, which jumbles the formatting.
Can I convert the tables from PDF to actual tables in MS-Word that expand when text is inserted in the table rows? I am using Windows XP and Acrobat Professional 9.
1. Re: Convert tables in PDF document to tables in Word correctly
The PDF likely does not contain the information needed for a proper conversion. If you put tabs in the pasted table text, you should be able to use the table tool in Word to convert the tab-separated text to table cells.
2. Re: Convert tables in PDF document to tables in Word correctly
The converted table in the Word document contains merged cells that are joined by connectors (lines, arrows) in MS-Word. Adding tabs to the converted table jumbles up the layout of the document.
What did you mean by pasted table?
Thanks for your suggestion and time.
3. Re: Convert tables in PDF document to tables in Word correctly
I meant copy and paste. If you have a Word document, change the table you have to text, add tabs where you want the cell boundaries, and then convert the text back to a table. This was easy to do prior to Office 2007, but I have no idea where the conversion command is in Office 2007 or 2010.
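To make the tab idea concrete, here is a minimal Python sketch of what a text-to-table conversion does with tab-separated text: each line becomes a row and each tab marks a cell boundary. It is only an illustration, not Word's actual implementation; the function name `tabbed_text_to_rows` and the sample data are made up for this example.

```python
def tabbed_text_to_rows(text):
    """Split tab-delimited text into rows, each a list of cell strings."""
    # One row per non-blank line, one cell per tab-separated field.
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    # Pad short rows with empty cells so the result is rectangular.
    width = max((len(row) for row in rows), default=0)
    return [row + [""] * (width - len(row)) for row in rows]


# Hypothetical sample: the last line is missing its third field.
sample = "Item\tQty\tPrice\nWidget\t2\t9.99\nGadget\t1"
for row in tabbed_text_to_rows(sample):
    print(row)
```

If the pasted text does not split cleanly into rows like this, the conversion in Word will probably not produce a clean table either.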
4. Re: Convert tables in PDF document to tables in Word correctly
Conversion of tables is much better in Acrobat X, where you can also export them to spreadsheets. If you do this a lot, I'd suggest downloading the trial and seeing what it does to your documents. You can't install Acrobat X and Acrobat 9 on the same computer at the same time, though, so it's best to use a spare machine if you have one.
5. Re: Convert tables in PDF document to tables in Word correctly
Thanks for clarifying.
In the MS-Word document, the data is in tables that are not actual tables but makeshift ones drawn with horizontal and vertical lines. So I cannot change the table to text, add tabs between cells, and then convert back to a table.
Thanks for your advice and time.
6. Re: Convert tables in PDF document to tables in Word correctly
Dave Merchant wrote:
Conversion of tables is much better in Acrobat X, where you can also export them to spreadsheets. If you do this a lot, I'd suggest downloading the trial and seeing what it does to your documents. You can't install Acrobat X and Acrobat 9 on the same computer at the same time, though, so it's best to use a spare machine if you have one.
Thanks, I will have to try it on a spare computer.
The PDF file is created from Coldfusion. I know I can create RTF documents in Coldfusion, which might alleviate the issue, but I don't know whether I can create tables in an RTF document and haven't had the time to test it.
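On the RTF question: RTF does have table markup (`\trowd` starts a row definition, `\cellx` sets a cell's right edge, `\cell` ends a cell, `\row` ends a row). Below is a minimal Python sketch, not Coldfusion code, that builds such a document as a string. The helper names `rtf_escape` and `rtf_table` and the fixed cell width are assumptions made for this illustration, and the sketch handles plain ASCII text only.

```python
def rtf_escape(text):
    """Escape the three characters that are special in RTF."""
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def rtf_table(rows, cell_width_twips=2000):
    """Return a minimal RTF document containing one table (ASCII text only)."""
    parts = ["{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}"]
    for row in rows:
        parts.append("\\trowd")  # begin the definition of a table row
        for i in range(len(row)):
            # \cellxN is the right edge of cell i, measured in twips
            parts.append("\\cellx%d" % ((i + 1) * cell_width_twips))
        for cell in row:
            # \intbl marks the text as table content, \cell closes the cell
            parts.append("\\intbl " + rtf_escape(cell) + "\\cell")
        parts.append("\\row")  # end of this row
    parts.append("}")
    return "\n".join(parts)


# Hypothetical sample data; no cell borders are defined in this sketch.
doc = rtf_table([["Name", "Total"], ["Alice", "42"]])
print(doc)
```

Saving the printed text with an .rtf extension and opening it in Word should give a real two-column table whose rows grow as text is added, although without visible borders since none are defined here.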
The Coldfusion part is as below:
<cfdocument format="pdf" filename="#report_filename_format#" overwrite="yes">
<table border="1" rules="all" cellpadding="5px" style="width:700px">
Do you think the tags in the PDF file are correct? If not, how can I fix them?
Thanks for your time and suggestions.
7. Re: Convert tables in PDF document to tables in Word correctly
As Dave alluded to, both repurposing PDF content and accessibility depend on Tagged PDF.
"Tagged" PDF is conceptually similar to XML or HTML markup, but it is implemented differently.
The current authoritative reference for Tagged PDF is ISO 32000-1:2008 (ISO/DIS 14289-1, which is just around the corner, will augment ISO 32000).
From what you have posted of the Coldfusion table markup it appears that you have a "layout" table piped into the PDF.
Such a table lacks the requisite table header row and table header cells that are required for a properly Tagged PDF Table element.
I suspect that the source application lacks any PDF Tag management facilities, so one would have to perform manual remediation of the PDF using Acrobat Pro. However, "table" content in a PDF that comes from using an authoring application's table features solely to control layout is invariably a "no-go" for any effective manual remediation.
Without at least a "workable" structure tree (which represents and orchestrates the PDF Tags, i.e. the markup), or better still a well-formed one, any PDF content export can be akin to a toss of the chicken bones.
If what Coldfusion produces can be altered to correctly reflect the discussion of the Table element in ISO 32000-1:2008, then the PDF content might be more amenable to manual Tag cleanup with Acrobat Pro.
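As a rough illustration of the difference between a layout table and a properly tagged one, here is a small Python sketch that models a structure tree with the standard structure types Table, TR, TH and TD, then checks for header cells. It does not read or write PDF files; the helpers `node`, `walk` and `has_header_cells` and the sample data are invented for this example.

```python
def node(tag, *children, text=""):
    """Build one structure-tree node as a plain dictionary."""
    return {"tag": tag, "children": list(children), "text": text}


def walk(n):
    """Yield a node and all of its descendants, depth first."""
    yield n
    for child in n["children"]:
        yield from walk(child)


def has_header_cells(table):
    """True if the table contains at least one TH (header cell) element."""
    return any(n["tag"] == "TH" for n in walk(table))


# A "layout" table: every cell is plain data (TD), nothing marks the headers.
layout_table = node(
    "Table",
    node("TR", node("TD", text="Name"), node("TD", text="Total")),
    node("TR", node("TD", text="Alice"), node("TD", text="42")),
)

# The same content with the first row tagged as header cells (TH).
data_table = node(
    "Table",
    node("TR", node("TH", text="Name"), node("TH", text="Total")),
    node("TR", node("TD", text="Alice"), node("TD", text="42")),
)

print(has_header_cells(layout_table))  # False
print(has_header_cells(data_table))    # True
```

In Acrobat Pro the comparable inspection would be done by eye in the Tags panel; the sketch only shows the nesting an exporter hopes to find.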
8. Re: Convert tables in PDF document to tables in Word correctly
Which source application are you referring to? The Coldfusion part that generates the PDF, or something else?
Can you point me to some links or examples that show how to generate an actual table element in Coldfusion that the manual tag cleanup in Acrobat Pro can handle properly?
Thanks for your wishes, suggestions and time.
9. Re: Convert tables in PDF document to tables in Word correctly
"The PDF file is created from Coldfusion." This makes Coldfusion the "source application".
To review the discussion of what an acceptable Table element, in PDF, is see
Go to section 14.
The Coldfusion "operator" will have to determine whether the application can produce compliant PDF output (that is, ISO 32000-1 compliant in the sense of providing Tagged PDF output). If it can, particularly with regard to the Table element in a Tagged PDF, then you'd have a viable starting point.
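On the HTML side, the header row and header cells discussed earlier in the thread correspond to `thead` and `th` elements. The Python sketch below only shows what such source markup looks like; the function `html_data_table` and its sample data are made up for illustration, and nothing in this thread confirms that Coldfusion's PDF output would carry this markup through as PDF Tags.

```python
from html import escape


def html_data_table(headers, rows):
    """Return an HTML table with an explicit header row and header cells."""
    # Header cells use <th>; scope="col" says each one labels a column.
    head = "".join('<th scope="col">%s</th>' % escape(h) for h in headers)
    # Data cells use <td>, one <tr> per row.
    body = "".join(
        "<tr>%s</tr>" % "".join("<td>%s</td>" % escape(c) for c in row)
        for row in rows
    )
    return (
        '<table border="1"><thead><tr>%s</tr></thead>'
        "<tbody>%s</tbody></table>" % (head, body)
    )


# Hypothetical sample data.
print(html_data_table(["Name", "Total"], [["Alice", "42"], ["Bob", "7"]]))
```

Markup like this at least gives a tag-aware converter something to work with, which a table made only of `td` cells does not.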
To take advantage of what is possible with Tagged PDF, the PDF must be adequately Tagged.
To get this, the application used to output the PDF needs adequate tag management as part of its feature set. Failing that, exporting content from an untagged or inadequately or inappropriately tagged PDF yields something of a mish-mash (so, lotsa cleanup).
Back to what Dave M. suggested: use a trial of Acrobat X to see if its improved export of untagged PDF content will serve your needs.
10. Re: Convert tables in PDF document to tables in Word correctly
Sorry for the late response.
I will try out the suggestions from your link and a trial of Acrobat X.
Thanks again, I marked your answer as correct as that was most helpful.