How to scan documents
From Wildsong
Brian Wilson
Revision as of 03:02, 8 November 2009
I have a Brother scanner with a document feeder. It scans a multipage document and puts the output into a PDF file on my server via FTP.
I am trying out the DjVu format; it seems like a good way to manage the scanned pages, and the compression is very good.
- Scan the odd pages, front to back, resulting in a single PDF file.
- Scan the even pages, back to front, resulting in a second PDF file.
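- Convert the 2 PDF documents to DJVU documents, 1 per page: mkdir f && pdf2djvu -i f/index.djvu frontpages.pdf (the -i option names the index file of an indirect document; the per-page files are written next to it).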
- Optionally perform any additional processing on the individual pages, such as image filtering or contrast enhancement.
- Perform OCR on the individual page files so they can be searched separately.
- Merge the page files into one DJVU bundle, making sure they end up in the right order. (QA!)
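Getting the order right in the last step is the error-prone part, so here is a minimal sketch of it. It assumes the front pages were converted into a directory f and the back pages into a directory b (b is my naming, not something fixed above), that neither scan had a misfeed, and that the page files have zero-padded names with no spaces, so a plain sort matches scan order. The function name interleave_pages and the page-NNNN.djvu output names are made up for illustration.

```shell
#!/bin/sh
# interleave_pages FRONT_DIR BACK_DIR OUT_DIR   (hypothetical helper)
# Copies the per-page files into OUT_DIR as page-0001.djvu, page-0002.djvu, ...
# in reading order. Fronts were scanned first to last, backs last to first,
# so the back list is reversed before the two lists are interleaved.
interleave_pages() {
    front_dir=$1
    back_dir=$2
    out_dir=$3

    # Both scans must contain the same number of pages.
    nf=$(ls "$front_dir"/*.djvu 2>/dev/null | wc -l | tr -d ' ')
    nb=$(ls "$back_dir"/*.djvu 2>/dev/null | wc -l | tr -d ' ')
    if [ "$nf" -eq 0 ] || [ "$nf" -ne "$nb" ]; then
        echo "page count problem: $nf fronts, $nb backs" >&2
        return 1
    fi

    lists=$(mktemp -d) || return 1
    ls "$front_dir"/*.djvu | sort    > "$lists/fronts"
    ls "$back_dir"/*.djvu  | sort -r > "$lists/backs"

    mkdir -p "$out_dir" || return 1
    n=0
    # paste with a newline delimiter alternates lines: front, back, front, ...
    paste -d '\n' "$lists/fronts" "$lists/backs" | while IFS= read -r src; do
        n=$((n + 1))
        cp "$src" "$(printf '%s/page-%04d.djvu' "$out_dir" "$n")"
    done
    rm -rf "$lists"
}

# Example (paths are assumptions):
#   interleave_pages f b combined
#   djvm -c bundle.djvu combined/page-*.djvu
```

Renaming the pages into one directory also avoids having two different files with the same name in the bundle. The djvm tool from DjVuLibre should then be able to build the bundle from the renamed files; page through the result in a viewer before deleting the originals.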
Notes
Packages under Ubuntu are poppler-utils and psutils. Installing the gscan2pdf package also pulled in various useful tools, such as tesseract and djvu2pdf.
On the Mac I use the viewer djview-libre, which is also available for Linux and Windows.