Sunday, 16 October 2016

How to extract images from a PDF file using C#.NET

In this article, we will learn how to extract images from a PDF file using iTextSharp in ASP.NET with C#. First, download the iTextSharp DLL from the link below.

https://github.com/itext/itextsharp

Related Articles

  1. How to generate PDF file using iTextSharp in C#
  2. How to export GridView data into PDF using iTextSharp in asp.net with C#
  3. Insert an image into PDF using iTextSharp with C# (C-Sharp)
  4. How to add meta information of PDF file using iTextSharp with C-Sharp

Once the file is downloaded, extract it; you will find 6 more .rar files inside. Extract itextsharp-dll-core.rar as well, then add a reference to itextsharp.dll in your project.

In Code-Behind File

Add the namespaces below.

using System;
using System.IO;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

Complete C# Code

namespace WebApplication1
{
    public partial class WebForm1 : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            if (!IsPostBack)
            {
                ExtractImage();
            }
        }

        public void ExtractImage()
        {
            // path of the existing PDF
            PdfReader reader = new PdfReader("E:/Example.pdf");
            // folder where the extracted images are saved
            string path = "E:/";
            try
            {
                // size of the cross-reference table (upper bound for object numbers)
                int n = reader.XrefSize;
                for (int i = 0; i < n; i++)
                {
                    // get the object with number i
                    PdfObject po = reader.GetPdfObject(i);
                    // skip missing objects and anything that is not a stream
                    if (po == null || !po.IsStream())
                        continue;
                    // cast the object to a stream
                    PRStream pst = (PRStream)po;
                    // get the object subtype and skip streams that are not images
                    PdfObject type = pst.Get(PdfName.SUBTYPE);
                    if (!PdfName.IMAGE.Equals(type))
                        continue;
                    // get the image and read its bytes into an array
                    PdfImageObject pio = new PdfImageObject(pst);
                    byte[] imgdata = pio.GetImageAsBytes();
                    // note: the bytes may not be JPEG data for every image;
                    // the ".jpg" extension is kept from the original example
                    using (FileStream fs = new FileStream(path + "image" + i + ".jpg", FileMode.Create))
                    {
                        // write the byte array to the file; the stream is closed even if the write fails
                        fs.Write(imgdata, 0, imgdata.Length);
                    }
                }
            }
            finally
            {
                // release the handle on the PDF file
                reader.Close();
            }
        }
    }
}
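
The code above saves every image with a .jpg extension, but the bytes returned by GetImageAsBytes() may be in another format depending on how the image is stored in the PDF. As a minimal sketch, the helper below guesses an extension from the first bytes of the data. The class ImageExtensionSniffer and its methods are hypothetical names introduced here for illustration; they are not part of iTextSharp, and only the JPEG, PNG and TIFF signatures are checked.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of iTextSharp): picks a file extension
// by looking at the leading signature bytes of the image data.
public static class ImageExtensionSniffer
{
    // Returns true when data begins with the given signature bytes.
    private static bool StartsWith(byte[] data, params byte[] signature)
    {
        if (data == null || data.Length < signature.Length)
            return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (data[i] != signature[i])
                return false;
        }
        return true;
    }

    // Guesses the extension; falls back to ".bin" for unrecognised data.
    public static string GuessExtension(byte[] data)
    {
        if (StartsWith(data, 0xFF, 0xD8, 0xFF))
            return ".jpg";
        if (StartsWith(data, 0x89, 0x50, 0x4E, 0x47))
            return ".png";
        if (StartsWith(data, 0x49, 0x49, 0x2A, 0x00) || StartsWith(data, 0x4D, 0x4D, 0x00, 0x2A))
            return ".tif";
        return ".bin";
    }

    // Builds the output file name in the same "image<number>" style as the article's code.
    public static string BuildImagePath(string folder, int objectNumber, byte[] data)
    {
        return Path.Combine(folder, "image" + objectNumber + GuessExtension(data));
    }
}
```

Under these assumptions, the FileStream in ExtractImage() could be opened with ImageExtensionSniffer.BuildImagePath(path, i, imgdata) instead of the hard-coded path + "image" + i + ".jpg".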