Millions of PDF invisibly embedded with your internal disk paths
I found an interesting privacy issue while analyzing PDF files. This bug occurs when you are using Internet Explorer to print locally saved web pages as PDF and affects all IE versions including IE8. It does not matter which PDF generation software you are using like Adobe Acrobat Professional, CutePDF, PrimoPDF, etc as long as you are invoking it from inside the IE print function. In Windows, even when your default browser is not IE and if you right click a file to select the PRINT from the context menu, then by default it invokes the IE print handler. So, you will still see this issue in the generated PDF.
This bug is NOT ABOUT the local disk path appearing in the FOOTER of your pdf since it is clearly visible and already known by most people. This is easy enough to hide by just going File -> Page Setup -> Change the Footer value from “URL” to “-Empty-”. After doing that, you will not expect your internal disk path being put anywhere else. However, that does not happen.
The privacy issue arises from the fact that your local disk path gets invisibly embedded inside your PDF in the title attribute. Only when you open the file in an Editor like Notepad, you will see it. Currently, there is no option in IE to disable it. The only workaround is to manually nullify this value by editing the PDF file. Note that this problem does not occur when using other browsers such as Firefox and Chrome. In fact, Chrome handles the other footer issue intelligently as well by showing your disk path as “…”, rather than exposing it.
Proof of Concept:
Steps to reproduce:
1. Pick a .HTM or .HTML or .MHT file on your local computer.
2. Open this file in IE and click Ctrl-P.
OR Right-click the file in explorer and select PRINT from context menu.
4. Select any PDF writer as Printer such as Adobe PDF / CutePDF / PrimoPDF / etc.
5. Click Print. When the PDF writer asks for a filename, provide any name.
6. Open the generated pdf in notepad, and search for “file://” without quotes.
Search for this on your favorite search engine (Google/Bing)
filetype:pdf file c (htm OR html OR mhtml)
So, out of 280 million pdfs accessible on the internet, more than 20% look to be exposing internal disk paths which is a huge number. I have contacted the Microsoft and Adobe Security Teams about this issue. Microsoft has plans to fix this in IE9, while Adobe has opened the case but hasn’t planned the timelines yet.
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 4.0-c316 44.253921, Sun Oct 01 2006 17:14:39"> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/"> <dc:format>application/pdf</dc:format> <dc:creator> <rdf:Seq> <rdf:li>LewtasS</rdf:li> </rdf:Seq> </dc:creator> <dc:title> <rdf:Alt> <rdf:li xml:lang="x-default">file://C:Documents and SettingslewtassDesktopeda newsletter</rdf:li> </rdf:Alt> </dc:title> </rdf:Description>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="3.1-701"> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> <rdf:Description rdf:about="" xmlns:pdf="http://ns.adobe.com/pdf/1.3/"> <pdf:Producer>Acrobat Distiller 7.0.5 (Windows)</pdf:Producer> </rdf:Description> <rdf:Description rdf:about="" xmlns:xap="http://ns.adobe.com/xap/1.0/"> <xap:CreatorTool>PScript5.dll Version 5.2.2</xap:CreatorTool> <xap:ModifyDate>2009-03-18T15:07:10-07:00</xap:ModifyDate> <xap:CreateDate>2009-03-18T15:07:10-07:00</xap:CreateDate> </rdf:Description> <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/"> <dc:format>application/pdf</dc:format> <dc:title> <rdf:Alt> <rdf:li xml:lang="x-default">mhtml:file://O:femashsp_2009draft ijsfy 2009 investment jus</rdf:li> </rdf:Alt>