Unfortunately software of all sorts has a pathological enthusiasm for adding defaulted, wrong metadata to everything. (look into medical charting and drug-dispensing software sometime if you're looking for a cheap scare).
Character-set and language tags are useless in practice, even the dumbest heuristics defeat them. Statistical analysis is so effective that encoding metadata should be forbidden, not required.
Character-set and language tags are useless in practice, even the dumbest heuristics defeat them. Statistical analysis is so effective that encoding metadata should be forbidden, not required.