How to scan documents
From Wildsong
By Brian Wilson
Revision as of 19:43, 7 November 2009
I have a Brother scanner with a document feeder. It scans a multipage document and uploads the output as a PDF file to my server via FTP.
- Scan the odd pages, front to back, resulting in a single PDF file.
- Scan the even pages, back to front, resulting in a second PDF file.
- Convert the 2 PDF documents to 2 PS documents, running this once per file: pdftops infile.pdf outfile.ps
- Split the PS documents into separate files, one page per file
- Optionally perform any additional processing on the individual pages, such as image compression
- For WEB version:
  - Perform OCR on the individual page files so they can be searched separately
  - Convert individual pages into PNG files for viewing
  - Put all pages into a book viewer collection
- Merge the page files into one PS document
- Convert the merged document back into a PDF document
- Perform OCR on the PDF doc
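The page order needed for the merge step can be sketched as follows. This is a minimal Python sketch, assuming the per-page files from each scan are already listed in the order they were scanned. The function name interleave_pages and the file names are hypothetical, chosen only for illustration.

```python
def interleave_pages(odd_scan, even_scan):
    """Return page files in reading order.

    odd_scan:  files for pages 1, 3, 5, ... in the order they were scanned.
    even_scan: files for the even pages in the order they were scanned,
               which is back to front (last even page first).
    """
    evens = list(reversed(even_scan))  # put the even pages front to back
    merged = []
    for i, odd_page in enumerate(odd_scan):
        merged.append(odd_page)
        if i < len(evens):  # tolerate a missing final even page
            merged.append(evens[i])
    return merged


# Hypothetical file names for a 6-page document.
odd = ["odd-1.ps", "odd-2.ps", "odd-3.ps"]      # pages 1, 3, 5
even = ["even-1.ps", "even-2.ps", "even-3.ps"]  # pages 6, 4, 2
print(interleave_pages(odd, even))
```

The first file of the even scan is the last page of the document, so the even list is reversed before the two lists are woven together. The resulting list is the order in which the page files would be handed to whatever tool merges them.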
Notes: Commands with '2' in the name, like 'pdf2ps', are from the ghostscript package. Commands with 'to' in the name, like 'pdftops', are from the poppler-utils package. I am not sure if there are any advantages to using one or the other when there are equivalent commands (for example 'pdf2ps' versus 'pdftops').
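Since either package may be installed on a given machine, one way to cope is to use whichever converter is found first. The sketch below is an assumption about how that could be scripted, not part of the workflow above. The helper names pick_converter and build_command are hypothetical, and it relies on both commands accepting the input PDF followed by the output PS file.

```python
import shutil


def pick_converter(candidates=("pdftops", "pdf2ps"), which=shutil.which):
    """Return the first candidate command found on PATH, or None."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None


def build_command(converter, infile, outfile):
    """Build the argument list for converting one PDF to PS."""
    # Both pdftops and pdf2ps take the input file, then the output file.
    return [converter, infile, outfile]


converter = pick_converter()
if converter is None:
    print("Neither pdftops nor pdf2ps was found on PATH")
else:
    print(build_command(converter, "infile.pdf", "outfile.ps"))
```

The argument list can be passed to subprocess.run once the choice looks right. The which parameter exists only so the lookup can be replaced when trying the function out without either package installed.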