Document File Formats
Brian Wilson
Latest revision as of 18:40, 21 September 2023
What is the best format to keep a given document in?
Source of document: paper
TIFF: this is a scanner output format and can be compressed.
Many people use PDF for document storage. There is also Adobe's XML-based PDF format; see the Adobe Mars project (http://labs.adobe.com/downloads/mars.html). It seems to be going nowhere.
There is the Microsoft XPS format, which is, well, a Microsoft standard, and I want to be able to open my documents for many years. Smirk.
I wonder what ODF can do?
- PDF is a document format; TIFF is more of an image format. Programs to view PDFs are a little more user-friendly and widely available.
- Both standards are pretty much open and universal.
- Sizes?
- PDFs can be encrypted. (It's part of the spec.)
Source of document: digital
This is mostly dictated by the format of the source file, but I am inclined to think I should settle on a few standards and transcode everything into those formats.
audio: mp3 (yes, I know it was long a patent-encumbered format, but it's ubiquitous). This will include voicemail if I ever go over to an Asterisk PBX here at home.
photo: jpeg or tiff - General rule: do not transcode TIFF to JPEG, because JPEG is lossy and the discarded detail cannot be recovered.
other image files: png - Generally I like png because it allows transparency. gif is for when I want a flying pelican or spinning gears on my loading page!
movie: I have so few movies right now that this is not relevant yet.
text files: I don't want to store formatted text files for long-term access in MS-Word format! What format does OO use? 2023 update: OpenDocument (ODF), which is zipped XML. Fortunately both OO and MS now handle each other's formats cleanly.
Plain text files should stay that way.
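The "settle on a few standards" idea above could be written down as a small lookup table. This is only a sketch: the extension list, the TARGETS table and the archive_name helper are my own assumptions for illustration, and it only decides the target name; it does not do any actual transcoding.

```python
from pathlib import Path

# Hypothetical policy table: source extension -> archive extension.
# TIFF stays TIFF (never down to lossy JPEG); plain text stays plain text.
TARGETS = {
    ".wav": ".mp3",   # audio goes to mp3 (assumes .wav is a typical source)
    ".mp3": ".mp3",
    ".tif": ".tif",
    ".tiff": ".tif",
    ".jpg": ".jpg",
    ".jpeg": ".jpg",
    ".bmp": ".png",   # other image files go to png
    ".png": ".png",
    ".gif": ".gif",
    ".txt": ".txt",
}


def archive_name(path: str) -> str:
    """Return the file name this policy would archive `path` under."""
    p = Path(path)
    target = TARGETS.get(p.suffix.lower())
    if target is None:
        # No rule yet (movies, for example): leave the name untouched.
        return p.name
    return p.with_suffix(target).name
```

For example, archive_name("scan.TIFF") gives "scan.tif" and archive_name("clip.mov") comes back unchanged, since movies have no rule yet.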
email: I think email should be stored in a MySQL database when it comes in and purged automatically after about a year unless I tag messages for archiving. This goes for both sent and received email. I might want to automatically tag/archive mail to or from certain addresses.
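A rough sketch of that email retention rule follows. To keep it self-contained it uses Python's built-in sqlite3 as a stand-in for MySQL; the table layout, the column names, the AUTO_ARCHIVE set and the example.com address are all hypothetical, so treat it as an illustration of the idea rather than a finished design.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical schema; sqlite3 stands in for MySQL here.
SCHEMA = """
CREATE TABLE IF NOT EXISTS mail (
    id        INTEGER PRIMARY KEY,
    direction TEXT NOT NULL CHECK (direction IN ('sent', 'received')),
    address   TEXT NOT NULL,
    subject   TEXT,
    body      TEXT,
    stored_at INTEGER NOT NULL,           -- Unix time, seconds
    archived  INTEGER NOT NULL DEFAULT 0  -- 1 = keep forever
)
"""

# Assumed list of addresses whose mail is archived automatically.
AUTO_ARCHIVE = {"accountant@example.com"}


def store(conn, direction, address, subject, body, now=None):
    """Store one sent or received message, auto-tagging known addresses."""
    now = now or datetime.now(timezone.utc)
    archived = 1 if address.lower() in AUTO_ARCHIVE else 0
    conn.execute(
        "INSERT INTO mail (direction, address, subject, body, stored_at, archived)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        (direction, address, subject, body, int(now.timestamp()), archived),
    )


def purge(conn, now=None, keep_days=365):
    """Delete untagged mail older than keep_days; return how many rows went."""
    now = now or datetime.now(timezone.utc)
    cutoff = int((now - timedelta(days=keep_days)).timestamp())
    cur = conn.execute(
        "DELETE FROM mail WHERE archived = 0 AND stored_at < ?", (cutoff,)
    )
    return cur.rowcount
```

The purge would be run on a schedule (nightly, say). Manual tagging is just an UPDATE that sets archived to 1 for a given message.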