PDF files are everywhere, but not all of them are tagged. A tagged PDF file contains extra structural elements called tags, which work much like HTML tags. If you have ever learned HTML (and who hasn’t), then you know about HTML tags such as the body tag, heading tags, the paragraph tag, the image tag and so on. There are dozens of tags in HTML and they can be used in hundreds of ways.
Similarly, in a tagged PDF file, all the content is contained within various tags. The tags describe the role of each piece of content: which portion is a heading, which is a section, which is a paragraph, which is a link and so on. These tags are not visible when you open a tagged PDF in a regular PDF viewer. However, they are very helpful when the PDF file is read with a screen reader or other assistive technology, because they tell the software how the document is structured.
So how do we find out whether a PDF file is actually a tagged PDF file? We can use the free Sumatra PDF Reader for this:
- First of all, download and install Sumatra PDF Reader from https://www.sumatrapdfreader.org/.
- Open the target PDF file in Sumatra PDF Reader and then select File → Properties from the menu bar. Alternatively, you can use the hotkey Ctrl+D.
- In the PDF properties window, the PDF Optimizations field shows Tagged PDF if the PDF is actually tagged. If the PDF file isn’t tagged, then you won’t see any such text.
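If you need to check many files, the same question can be approximated with a short script. The sketch below is only a heuristic, not how Sumatra PDF Reader does it: it assumes that a tagged PDF declares itself with a `/Marked true` entry (part of the `/MarkInfo` dictionary in the document catalog) and that this entry is stored uncompressed in the file. The function names `looks_tagged` and `is_tagged_pdf` are made up for this example. A PDF that keeps its catalog inside a compressed object stream will be reported as untagged, so treat a negative result as "unknown" rather than as proof.

```python
import re

# Heuristic: tagged PDFs normally carry "/Marked true" inside the
# /MarkInfo dictionary of the document catalog. We scan the raw bytes
# for that entry instead of parsing the PDF properly.
_MARKED_TRUE = re.compile(rb"/Marked\s+true")


def looks_tagged(pdf_bytes: bytes) -> bool:
    """Return True if the raw PDF bytes contain a '/Marked true' entry."""
    return _MARKED_TRUE.search(pdf_bytes) is not None


def is_tagged_pdf(path: str) -> bool:
    """Read a PDF from disk and apply the same heuristic (hypothetical helper)."""
    with open(path, "rb") as fh:
        return looks_tagged(fh.read())
```

For example, `is_tagged_pdf("report.pdf")` (with `report.pdf` standing in for your own file) returns `True` when the marker is found. For anything beyond a quick check, a real PDF parsing library that reads the document catalog is the more reliable choice.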
Document editors such as Microsoft Word or OpenOffice Writer are able to create tagged PDF files. Depending on the text formatting, OpenOffice Writer can automatically add various tags to the PDF file. A tagged PDF is usually not much larger in file size than an untagged version of the same document.