PDF Render Garbles Text, but I can't replicate
An acquaintance contacted me and told me that PDFs on my job market website aren't displaying correctly. He sent a screenshot, so I know what he is looking at:
http://bensresearch.com/downloads/FF.png
(The PDF he is viewing is http://bensresearch.com/downloads/CV.pdf )
This is my job market site, so I want to make sure anyone can read the PDFs. However, I can't replicate the problem. I've tried on Windows 7, Windows 8, and Mac, with no joy. The built-in PDF renderer in Firefox 24 seems to read the PDF file fine. (I asked him what version he is running and he said version 24 on Windows 7 Enterprise.)
Any ideas?
Thanks!
All Replies (9)
Hi tazz_ben, this document looks okay for me, but when selecting and right-clicking, there is some evidence that Firefox's viewer could be having a problem with the font used for small caps. (See attached screenshot.)
Hi rivertube, is there a public link for the PDF?
Could you try opening the document in Firefox's Safe Mode? That's a standard diagnostic tool to bypass interference by extensions (and some custom settings). More info: Diagnose Firefox issues using Troubleshoot Mode.
You can restart Firefox in Safe Mode using
Help > Restart with Add-ons Disabled
In the dialog, click "Start in Safe Mode" (not Reset)
Any difference?
I copied "Software Engineer" to the clipboard:
S E : S E 
It looks like those bytes are prefixed with F7 bytes to make them act like an x-user-defined charset.
Apparently this is also used in XMLHttpRequest.
From:
After reviewing some UNICODE documents, it seems that the explanation is that the charset x-user-defined uses the UNICODE Private Area 0xF700-0xF7ff to map its range.
- https://developer.mozilla.org/Web/API/XMLHttpRequest/Using_XMLHttpRequest#Handling_binary_data
- https://developer.mozilla.org/Web/API/XMLHttpRequest/Sending_and_Receiving_Binary_Data
The magic happens in line 5, which overrides the MIME type, forcing the browser to treat it as plain text, using a user-defined character set. This tells the browser not to parse it, and to let the bytes pass through unprocessed.
    var filestream = load_binary_resource(url);
    var abyte = filestream.charCodeAt(x) & 0xff; // throw away high-order byte (f7)
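For anyone following along, here is a minimal sketch of the masking step quoted above. It assumes the x-user-defined mapping in which bytes 0x80-0xFF land in the private-use range U+F780-U+F7FF while lower bytes stay plain ASCII. The function names (toUserDefined, toBytes, hasPrivateUse) are made up for illustration and are not part of Firefox or the PDF viewer; whether the viewer really produces text this way is only a guess based on the clipboard contents.

```javascript
// Hypothetical helper: decode raw bytes the way an x-user-defined charset
// is assumed to: ASCII stays as-is, high bytes get an F7 high-order byte.
function toUserDefined(bytes) {
  return Array.from(bytes, function (b) {
    return String.fromCharCode(b < 0x80 ? b : 0xF700 + b);
  }).join("");
}

// Hypothetical helper: recover the original bytes by throwing away the
// high-order byte, as in the quoted "& 0xff" line.
function toBytes(text) {
  return Array.from(text, function (ch) {
    return ch.charCodeAt(0) & 0xff;
  });
}

// Hypothetical helper: report whether a copied string contains characters
// from the private-use range assumed above (U+F780-U+F7FF).
function hasPrivateUse(text) {
  return Array.from(text).some(function (ch) {
    var code = ch.charCodeAt(0);
    return code >= 0xF780 && code <= 0xF7FF;
  });
}
```

Pasting the garbled clipboard text into hasPrivateUse would be one quick way to check whether the copied characters really sit in that private-use range.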
Hello,
Your document renders improperly for me as well. I'm running Win 7 Enterprise 64-bit, with Firefox 24. We also have a document that is rendering improperly at http://msudenver.edu/media/content/admissions/documents/2012-2013_Colorado_Community_College_Transfer%20Booklet.pdf. I upgraded to Firefox 25 and the issue continues to occur for both documents. In the document on my site, the improper rendering seems to occur only on lines containing bullet points.
I installed Firefox Nightly and the issue does not appear to be resolved there yet.
What I did notice is that when we download the document and open it with Adobe Reader the improper rendering does not appear.
While not an ideal solution, we put a notice near the document on our site instructing our users to right-click and "save target as" in the event that the document doesn't display correctly.
Hopefully this gets resolved soon.
Hi jrobida, could you post a screen shot of the problem you're seeing? I don't see any garbled text in Firefox 24, but maybe it's being disguised by some other issue.
I'm now running Firefox 25, but the issue was the same in 24. I updated to 25 hoping that it might fix the issue.
Hi jrobida, thanks for the screen shot. I'm not seeing that.
When I select and copy one of the bullets from the PDF Viewer and paste into Word, I get Arial font.
When I select and copy one of the bullets from Adobe Reader and paste into Word, I get Univers font.
So the PDF Viewer is substituting the font. I don't know how these substitution decisions are made, but it seems on your system something very inappropriate is being chosen. If you paste from the PDF Viewer into a word processor, what font are you getting?
When I copy the garbled text from the first line of the first bullet into Word 2013, I get Arial. Each word is also pasted onto a new line for some reason.
What's interesting is that when I copy the un-garbled text from the second line of the first bullet I get Verdana (my default font in Word).
PDF rendering APIs are mainly divided into two categories. One renders selected PDF pages to an image resource, and the other converts the rendered image resource to the desired image format.